RECOMBINANT TOXIN FRAGMENTS

This invention relates to recombinant toxin fragments, to DNA encoding these
fragments and to their uses, such as in a vaccine and for in vitro and in vivo
purposes.

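The clostridial neurotoxins are potent inhibitors of calcium-dependent
neurotransmitter secretion in neuronal cells. They are currently considered to
mediate this activity through a specific endoproteolytic cleavage of at least one of
three vesicle or pre-synaptic membrane associated proteins, VAMP, syntaxin or
SNAP-25, which are central to the vesicle docking and membrane fusion events of
neurotransmitter secretion. The neuronal cell targeting of tetanus and botulinum
neurotoxins is considered to be a receptor mediated event following which the
toxins become internalised and subsequently traffic to the appropriate intracellular
compartment, where they effect their endopeptidase activity.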

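The clostridial neurotoxins share a common architecture of a catalytic L-chain (LC,
ca 50 kDa) disulphide linked to a receptor binding and translocating H-chain (HC,
ca 100 kDa). The HC polypeptide is considered to comprise all or part of two
distinct functional domains. The carboxy-terminal half of the HC (ca 50 kDa),
termed the H_C domain, is involved in the high affinity, neurospecific binding of
the neurotoxin to cell surface receptors on the target neuron, whilst the
amino-terminal half, termed the H_N domain (ca 50 kDa), is considered to mediate
the translocation of at least some portion of the neurotoxin across cellular
membranes such that the functional activity of the LC is expressed within the
target cell. The H_N domain also has the property, under conditions of low pH, of
forming ion-permeable channels in lipid membranes; this may in some manner relate
to its translocation function.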

For botulinum neurotoxin type A (BoNT/A) these domains are considered to reside
within amino acid residues 872-1296 for the H_C, amino acid residues 449-871 for
the H_N and residues 1-448 for the LC. Digestion with trypsin effectively degrades
the H_C domain of BoNT/A to generate a non-toxic fragment designated LH_N,
which is no longer able to bind to and enter neurons. The LH_N fragment so
produced also has the property of enhanced solubility compared to both the parent
holotoxin and the isolated LC.

It is therefore possible to provide functional definitions of the domains within the
neurotoxin molecule, as follows:

(A) clostridial neurotoxin light chain:

- a metalloprotease exhibiting high substrate specificity for vesicle and/or plasma
  membrane associated proteins involved in the exocytotic process. In particular, it
  cleaves one or more of SNAP-25, VAMP (synaptobrevin/cellubrevin) and syntaxin.

(B) clostridial neurotoxin heavy chain H_N domain:

- a portion of the heavy chain which enables translocation of that portion of the
  neurotoxin molecule such that a functional expression of light chain activity
  occurs within a target cell.

- the domain responsible for translocation of the endopeptidase activity, following
  binding of neurotoxin to its specific cell surface receptor via the binding
  domain, into the target cell.

- the domain responsible for formation of ion-permeable pores in lipid membranes
  under conditions of low pH.

- the domain responsible for increasing the solubility of the entire polypeptide
  compared to the solubility of light chain alone.

(C) clostridial neurotoxin heavy chain H_C domain:

- a portion of the heavy chain which is responsible for binding of the native
  holotoxin to cell surface receptor(s) involved in the intoxicating action of
  clostridial toxin prior to internalisation of the toxin into the cell.

The identity of the cellular recognition markers for these toxins is currently not
understood and no specific receptor species have yet been identified, although
Kozaki et al. have reported that synaptotagmin may be the receptor for botulinum
neurotoxin type B. It is probable that each of the neurotoxins has a different
receptor.

It is desirable to have positive controls for toxin assays, to develop clostridial toxin
vaccines and to develop therapeutic agents incorporating desirable properties of
clostridial toxin.

However, due to its extreme toxicity, the handling of native toxin is hazardous.
The present invention seeks to overcome or at least ameliorate problems associated
with production and handling of clostridial toxin.

Accordingly, the invention provides a polypeptide comprising first and second
domains, wherein said first domain is adapted to cleave one or more vesicle or
plasma-membrane associated proteins essential to neuronal exocytosis and wherein
said second domain is adapted (i) to translocate the polypeptide into the cell or (ii)
to increase the solubility of the polypeptide compared to the solubility of the first
domain on its own or (iii) both to translocate the polypeptide into the cell and to
increase the solubility of the polypeptide compared to the solubility of the first
domain on its own, said polypeptide being free of clostridial neurotoxin and free of
any clostridial neurotoxin precursor that can be converted into toxin by proteolytic
action. The invention may thus provide a single polypeptide chain containing a
domain equivalent to a clostridial toxin light chain and a domain providing the
functional aspects of the H_N of a clostridial toxin heavy chain, whilst lacking the
functional aspects of a clostridial toxin H_C domain.
For the purposes of the invention, the functional property or properties of the H_N
of a clostridial toxin heavy chain that are required to be exhibited by the second
domain of the polypeptide of the invention are either (i) translocation of the
polypeptide into a cell, or (ii) increasing solubility of the polypeptide compared to
solubility of the first domain on its own, or (iii) both (i) and (ii). References
hereafter to a H_N domain or to the functions of a H_N domain are references to this
property or properties. The second domain is not required to exhibit other properties
of the H_N domain of a clostridial toxin heavy chain.

A polypeptide of the invention can thus be soluble but lack the translocation
function of a native toxin; this is of use in providing an immunogen for vaccinating
or assisting to vaccinate an individual against challenge by toxin. In a specific
embodiment of the invention described in an example below, a polypeptide
designated LH423/A elicited neutralising antibodies against type A neurotoxin. A
polypeptide of the invention can likewise thus be relatively insoluble but retain the
translocation function of a native toxin; this is of use if solubility is imparted to a
composition made up of that polypeptide and one or more other components by
one or more of said other components.

The first domain of the polypeptide of the invention cleaves one or more vesicle or
plasma-membrane associated proteins essential to the specific cellular process of
exocytosis, and cleavage of these proteins results in inhibition of exocytosis. The
cells affected are not restricted to a particular type or subgroup of cells.
The activity of clostridial neurotoxins in inhibiting exocytosis has, indeed, been
observed almost universally in eukaryotic cells expressing a relevant cell surface
receptor, including such diverse cells as those from Aplysia (sea slug), Drosophila
(fruit fly) and mammalian nerve cells, and the activity of the first domain is to be
understood as including a corresponding range of cells.

The polypeptide of the invention may be obtained by expression of a recombinant
nucleic acid, preferably a DNA, and is a single polypeptide, that is to say not
cleaved into separate light and heavy chain domains. The polypeptide is thus
available in convenient and large quantities using recombinant techniques.

In a polypeptide according to the invention, said first domain preferably comprises
a clostridial toxin light chain or a fragment or variant of a clostridial toxin light
chain. The fragment is optionally an N-terminal or C-terminal fragment of the light
chain, or is an internal fragment, so long as it substantially retains the ability to
cleave the vesicle or plasma-membrane associated protein essential to exocytosis.
The minimal domains necessary for the activity of the light chain of clostridial toxins
are described in J. Biol. Chem., Vol. 267, No. 21, July 1992, pages 14721 et seq.
The variant has a different peptide sequence from the light chain or from the
fragment, though it too is capable of cleaving the vesicle or plasma-membrane
associated protein. It is conveniently obtained by insertion, deletion and/or
substitution of a light chain or fragment thereof. In embodiments of the invention
described below a variant sequence comprises (i) an N-terminal extension to a
clostridial toxin light chain or fragment, (ii) a clostridial toxin light chain or fragment
modified by alteration of at least one amino acid, (iii) a C-terminal extension to a
clostridial toxin light chain or fragment, or (iv) combinations of 2 or more of (i)-(iii).

In further embodiments of the invention, the variant contains an amino acid
sequence modified so that (a) there is no protease sensitive region between the LC
and H_N components of the polypeptide, or (b) the protease sensitive region is
specific for a particular protease. This latter embodiment is of use if it is desired to
activate the endopeptidase activity of the light chain in a particular environment
or cell. In general, though, the polypeptides of the invention are activated prior to
administration.

The first domain preferably exhibits endopeptidase activity specific for a substrate
selected from one or more of SNAP-25, synaptobrevin/VAMP and syntaxin. The
clostridial toxin is preferably botulinum toxin or tetanus toxin.
In an embodiment of the invention described in an example below, the toxin light
chain and the portion of the toxin heavy chain are of botulinum toxin type A. In
a further embodiment of the invention described in an example below, the toxin
light chain and the portion of the toxin heavy chain are of botulinum toxin type B.
The polypeptide optionally comprises a light chain or fragment or variant of one
toxin type and a heavy chain or fragment or variant of another toxin type.

In a polypeptide according to the invention said second domain preferably
comprises a clostridial toxin heavy chain H_N portion or a fragment or variant of a
clostridial toxin heavy chain H_N portion. The fragment is optionally an N-terminal
or C-terminal or internal fragment, so long as it retains the function of the H_N
domain. Teachings of regions within the H_N responsible for its function are
provided, for example, in Biochemistry 1995 and in Eur. J. Biochem. 1989. The
variant has a different sequence from the H_N domain or fragment, though it too
retains the function of the H_N domain. It is conveniently obtained by insertion,
deletion and/or substitution of a H_N domain or fragment thereof. In embodiments
of the invention described below, it comprises (i) an N-terminal extension to a H_N
domain or fragment, (ii) a C-terminal extension to a H_N domain or fragment, (iii) a
modification to a H_N domain or fragment by alteration of at least one amino acid,
or (iv) combinations of 2 or more of (i)-(iii). The clostridial toxin is preferably
botulinum toxin or tetanus toxin.

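The invention also provides a polypeptide comprising a clostridial neurotoxin light
chain and a N-terminal fragment of a clostridial neurotoxin heavy chain, the
fragment preferably comprising at least 423 of the N-terminal amino acids of the
heavy chain of botulinum toxin type A, 417 of the N-terminal amino acids of the
heavy chain of botulinum toxin type B, or the equivalent number of N-terminal
amino acids of the heavy chain of other types of clostridial toxin such that the
fragment possesses an equivalent alignment of homologous amino acid residues.

These polypeptides of the invention are thus not composed of two or more
polypeptides linked, for example, by disulphide bridges into composite molecules.
Instead, these polypeptides are single chains and are not active, or their activity is
significantly reduced, in an in vitro assay of neurotoxin endopeptidase activity.

PCT/GB97/02273
WO 98/07864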

Further, the polypeptides may be susceptible to be converted into a form exhibiting
endopeptidase activity by the action of a proteolytic agent, such as trypsin. In this
way it is possible to control the endopeptidase activity of the toxin light chain.

In a specific embodiment of the invention described in an example below, there is
provided a polypeptide lacking a portion designated H_C of a clostridial toxin heavy
chain. This portion, seen in the naturally produced toxin, is responsible for binding
of toxin to cell surface receptors prior to internalisation of the toxin. This specific
embodiment is therefore adapted so that it can not be converted into active toxin,
for example by the action of a proteolytic enzyme. The invention thus also
provides a polypeptide comprising a clostridial toxin light chain and a fragment of
a clostridial toxin heavy chain, said fragment being not capable of binding to those
cell surface receptors involved in the intoxicating action of clostridial toxin, and it
is preferred that such a polypeptide lacks an intact portion designated H_C of a
clostridial toxin heavy chain.

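In further embodiments of the invention there are provided compositions containing
a polypeptide comprising a clostridial toxin light chain and a portion designated H_N
of a clostridial toxin heavy chain, and wherein the composition is free of clostridial
toxin and free of any clostridial toxin precursor that may be converted to clostridial
toxin by the action of a proteolytic enzyme. Examples of these compositions
include those containing toxin light chain and H_N sequences of
botulinum toxin types A, B, C1, D, E, F and G.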

The polypeptides of the invention are conveniently adapted to bind to, or include,
a third domain that binds to a ligand for targeting to desired cells. The polypeptide
optionally comprises a sequence that binds to, for example, an immunoglobulin. A
suitable sequence is a tandem repeat synthetic IgG binding domain derived from
domain B of Staphylococcal protein A. Choice of immunoglobulin specificity then
determines the target for a polypeptide-immunoglobulin complex. Alternatively, the
polypeptide comprises a non-clostridial sequence that binds to a cell surface
receptor, suitable sequences including insulin-like growth factor-1 (IGF-1), which
binds to its specific receptor on particular cell types, and the 14 amino acid residue
sequence from the carboxy-terminus of cholera toxin A subunit, which is able to
bind the cholera toxin B subunit and thence to GM1 gangliosides. A polypeptide
according to the invention thus, optionally, further comprises a third domain
adapted for binding of the polypeptide to a cell.

In a second aspect the invention provides a fusion protein comprising a fusion of
(a) a polypeptide of the invention as described above with (b) a second polypeptide
adapted for binding to a chromatography matrix so as to enable purification of the
fusion protein using said chromatography matrix. It is convenient for the second
polypeptide to be adapted to bind to an affinity matrix, enabling rapid separation
and purification of the fusion protein from an impure source, such as a cell extract
or supernatant.

One possible second purification polypeptide is glutathione-S-transferase (GST),
and others will be apparent to a person of skill in the art, being chosen so as to
enable purification on a chromatography column according to conventional
techniques.

As noted above, by proteolytic treatment, for example using trypsin, of a
polypeptide of the invention it is possible to induce endopeptidase activity in the
treated polypeptide. A third aspect of the invention provides a composition
comprising a derivative of a clostridial toxin, said derivative retaining at least 10%
of the endopeptidase activity of the clostridial toxin, said derivative further being
non-toxic in vivo due to its inability to bind to cell surface receptors, and wherein
the composition is free of any component, such as toxin or a further toxin
derivative, that is toxic in vivo. Activity of the derivative preferably approaches
that of natural toxin, and is thus preferably at least 30% and most preferably at
least 60% of natural toxin.
While it is known to treat naturally produced clostridial toxin to remove the H_C
domain, this treatment does not totally remove toxicity of the preparation; instead
some residual toxin activity remains. Natural toxin treated in this way is therefore
still not entirely safe. The composition of the invention, derived by treatment of
a pure source of polypeptide, advantageously is free of toxicity, and can
conveniently be used as a positive control in a toxin assay, as a vaccine against
clostridial toxin or for other purposes where it is essential that there is no residual
toxicity in the composition.

The invention enables production of the polypeptides and fusion proteins of the
invention by recombinant means.

A fourth aspect of the invention provides a nucleic acid encoding a polypeptide or
fusion protein according to any of the aspects of the invention described above.

In one embodiment of this aspect of the invention, a DNA sequence provided to
code for the polypeptide or fusion protein is not derived from native clostridial
sequences, but is an artificially derived sequence not preexisting in nature.

A specific DNA (SEQ ID NO: 1), described in more detail below, encodes a
polypeptide or a fusion protein comprising the light chain domain and the first
423 amino acid residues of the amino terminal portion of a botulinum toxin
type A heavy chain. This recombinant product is designated LH423/A (SEQ ID
NO: 2).

In a second embodiment of this aspect of the invention, a DNA sequence which
codes for the polypeptide or fusion protein is derived from native clostridial
sequences but codes for a polypeptide or fusion protein not found in nature.

A specific DNA (SEQ ID NO: 19), described in more detail below, encodes a
polypeptide or a fusion protein and comprises nucleotides encoding residues
1-1171 of a botulinum toxin type B. Said polypeptide comprises the light chain
domain and the first 728 amino acid residues of the amino terminal portion of a
botulinum type B heavy chain. This recombinant product is designated LH728/B
(SEQ ID NO: 20).

The invention thus also provides a method of manufacture of a polypeptide
comprising expressing in a host cell a DNA according to the invention. The host
cell is suitably not able to cleave a polypeptide or fusion protein of the invention
so as to separate light and heavy toxin chains; for example, the host cell is
a non-clostridial host.

The invention further provides a method of manufacture of a polypeptide
comprising expressing in a host cell a DNA encoding a fusion protein as described
above, purifying the fusion protein by elution through a chromatography column
adapted to retain the fusion protein, eluting through said chromatography column
a ligand adapted to displace the fusion protein, and recovering the fusion protein.
Production of substantially pure fusion protein is thus made possible. Likewise, the
fusion protein is readily cleaved to yield a polypeptide of the invention, again in
substantially pure form, as the second polypeptide may conveniently be removed
using the same type of chromatography column.

The LH_N/A derived from dichain native toxin requires extended digestion with
trypsin to remove the C-terminal 1/2 of the heavy chain, the H_C domain. The loss
of this domain effectively renders the toxin inactive in vivo by preventing its
interaction with host target cells. There is, however, a residual toxic activity which
may indicate a contaminating, trypsin insensitive, form of the whole type A
neurotoxin.
If expression hosts produce a processed, active polypeptide, it is not a toxin.
Endopeptidase activity of LH423/A, as assessed by the current in vitro peptide
cleavage assay, is wholly dependent on activation of the recombinant molecule
between residues 430 and 454 by trypsin. Other proteolytic enzymes that cleave
between these two residues are generally also suitable for activation of the
recombinant molecule. Trypsin cleaves the peptide bond C-terminal to arginine or
C-terminal to lysine and is suitable as these residues are found in the 430-454
region and are exposed (see Fig. 12).
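
The cleavage rule stated above (a peptide bond C-terminal to arginine or lysine, within a defined residue window) can be illustrated with a minimal sketch. The function name and the toy peptide are hypothetical and chosen only for illustration; the sketch applies the simple K/R rule as stated in the text and does not model residue exposure or any other determinant of susceptibility.

```python
def tryptic_sites(sequence, start, end):
    """Return 1-based positions of K or R residues inside [start, end].

    Each returned position marks a residue whose C-terminal peptide bond
    is a candidate trypsin cleavage site under the simple K/R rule.
    """
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("window must lie within the sequence")
    return [
        pos
        for pos, residue in enumerate(sequence, start=1)
        if start <= pos <= end and residue in "KR"
    ]


# Hypothetical toy peptide, not a sequence from the listing.
toy_peptide = "ACKDERGHKL"
print(tryptic_sites(toy_peptide, 1, len(toy_peptide)))
print(tryptic_sites(toy_peptide, 4, 8))
```

Restricting the window, as in the second call, mirrors the way the text confines activation to a defined region of the molecule.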

The recombinant polypeptides of the invention are potential therapeutic agents for
targeting to cells expressing the relevant substrate but which are not implicated in
effecting botulism. An example might be where secretion of neurotransmitter is
inappropriate or undesirable, or alternatively where a neuronal cell is hyperactive in
terms of regulated secretion of substances other than neurotransmitter. In such an
example the function of the H_C domain of the native toxin could be replaced by an
alternative targeting sequence providing, for example, a cell receptor ligand and/or
translocation domain.

One application of the recombinant polypeptides of the invention will be as a
reagent component for synthesis of therapeutic molecules, such as disclosed in
WO-A-94/21300. The recombinant product will also find application as a non-toxic
standard for the assessment and development of in vitro assays for detection of
functional botulinum or tetanus neurotoxins either in foodstuffs or in environmental
samples, for example as disclosed in EP-A-0763131.

A further option is addition, to the C-terminal end of a polypeptide of the invention,
of a peptide sequence which allows specific chemical conjugation to targeting
ligands of both protein and non-protein origin.

In yet a further embodiment an alternative targeting ligand is added to the
N-terminus of polypeptides of the invention. Recombinant LH_N derivatives have been
designed that have specific protease cleavage sites engineered at the C-terminus
of the LC at the putative trypsin sensitive region and also at the extreme
C-terminus of the complete protein product. These sites will enhance the activational
specificity of the recombinant product such that the dichain species can only be
activated by proteolytic cleavage of a more predictable nature than use of trypsin.

The LH_N enzymatically produced from native BoNT/A is an efficient immunogen and
thus the recombinant form, with its total divorce from any full length neurotoxin,
represents a vaccine component. The recombinant product may serve as a basal
reagent for creating defined protein modifications in support of any of the above
areas.

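Recombinant constructs are assigned distinguishing names on the basis of their
amino acid sequence length and their Light Chain (L-chain, L) and Heavy Chain
(H-chain, H) content as these relate to translated DNA sequences in the public
domain or specifically to SEQ ID NO: 2 and SEQ ID NO: 20. The 'LH' designation is
followed by '/X' where 'X' denotes the corresponding clostridial toxin serotype or
class, e.g. 'A' for botulinum neurotoxin type A or 'TeTx' for tetanus toxin.
Sequence variants from that of the native toxin polypeptide are given in parenthesis
in standard format, namely the residue position number prefixed by the residue of
the native sequence and suffixed by the residue of the variant.

Subscript number prefixes indicate an amino-terminal (N-terminal) extension, or
where negative a deletion, to the translated sequence. Similarly, subscript number
suffixes indicate a carboxy-terminal (C-terminal) extension or, where negative
numbers are used, a deletion. Specific sequence inserts such as protease cleavage
sites are indicated using abbreviations, e.g. Factor Xa is abbreviated to FXa.
L-chain C-terminal suffixes and H-chain N-terminal prefixes are separated by a / to
indicate the predicted junction between the L and H-chains. Abbreviations for
engineered ligand sequences are prefixed or suffixed to the clostridial L-chain or
H-chain corresponding to their position in the translation product.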

Following this nomenclature,

LH423/A              = SEQ ID NO: 2, containing the entire L-chain and 423
                       amino acids of the H-chain of botulinum neurotoxin type
                       A;

2LH423/A             = a variant of this molecule, containing a two amino acid
                       extension to the N-terminus of the L-chain;

2L/2H423/A           = a further variant in which the molecule contains a two
                       amino acid extension on the N-terminus of both the
                       L-chain and the H-chain;

2LFXa/2H423/A        = a further variant containing a two amino acid extension
                       to the N-terminus of the L-chain, and a Factor Xa
                       cleavage sequence at the C-terminus of the L-chain
                       which, after cleavage of the molecule with Factor Xa,
                       leaves a two amino acid N-terminal extension to the
                       H-chain component; and

2LFXa/2H423/A-IGF-1  = a variant of this molecule which has a further C-terminal
                       extension to the H-chain, in this example the insulin-like
                       growth factor 1 (IGF-1) sequence.
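
The naming rules above lend themselves to mechanical parsing. The following is a minimal sketch, assuming the subscripted names are flattened to plain ASCII (for example `2LFXa/2H423/A-IGF-1`); that flattening, the regular expression and the names `Construct` and `parse_construct` are assumptions of this sketch and are not part of the specification.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed ASCII flattening of the subscripted names, e.g. "2LFXa/2H423/A-IGF-1".
_NAME = re.compile(
    r"^(?P<l_ext>-?\d+)?L"                            # N-terminal extension of the L-chain
    r"(?:(?P<insert>[A-Za-z]+)?/(?P<h_ext>-?\d+)?)?"  # optional insert, '/' junction, H-chain prefix
    r"H(?P<h_len>\d+)"                                # number of H-chain residues
    r"/(?P<serotype>[A-Za-z0-9]+?)"                   # toxin serotype or class
    r"(?:-(?P<ligand>[A-Za-z0-9-]+))?"                # optional C-terminal ligand
    r"(?:\s*\((?P<variants>[^)]*)\))?$"               # optional substitutions in parenthesis
)
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


@dataclass
class Construct:
    l_extension: int
    l_insert: Optional[str]
    h_extension: int
    h_length: int
    serotype: str
    ligand: Optional[str]
    substitutions: List[Tuple[str, int, str]] = field(default_factory=list)


def parse_construct(name: str) -> Construct:
    """Split a flattened construct name into its documented components."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognised construct name: {name!r}")
    substitutions = []
    raw_items = (match["variants"] or "").split(",")
    for item in (raw.strip() for raw in raw_items):
        if not item:
            continue
        sub = _SUBSTITUTION.match(item)
        if sub is None:
            raise ValueError(f"unrecognised substitution: {item!r}")
        # (native residue, position, variant residue)
        substitutions.append((sub[1], int(sub[2]), sub[3]))
    return Construct(
        l_extension=int(match["l_ext"] or 0),
        l_insert=match["insert"],
        h_extension=int(match["h_ext"] or 0),
        h_length=int(match["h_len"]),
        serotype=match["serotype"],
        ligand=match["ligand"],
        substitutions=substitutions,
    )


print(parse_construct("2LFXa/2H423/A-IGF-1"))
print(parse_construct("23LH423/A (Q2E,N26K,A27Y)"))
```

Negative prefixes are accepted because the nomenclature uses them for deletions; a missing prefix is read as an extension of zero.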

There now follows description of specific embodiments of the invention, illustrated
by drawings in which:

Fig. 1 shows a schematic representation of the domain structure of botulinum
neurotoxin type A (BoNT/A);

Fig. 2 shows a schematic representation of assembly of the gene for an embodiment
of the invention designated LH423/A;
Fig. 3 is a graph comparing activity of native toxin, trypsin generated "native"
LH_N/A and an embodiment of the invention designated 2LH423/A (Q2E,N26K,A27Y)
in an in vitro peptide cleavage assay;

Fig. 4 is a comparison of the first N-terminal amino acids of native toxin and of
embodiments of the invention;

Fig. 5 shows the transition region of an embodiment of the invention designated
L/4H423/A; amino acids coded for by the Eco 47 III restriction endonuclease
cleavage site are marked and the H_N sequence then begins ALN...;

Fig. 6 shows the transition region of an embodiment of the invention designated
LFXa/3H423/A, with additional amino acids coded for at the N-terminus of the
H-chain sequence; the N-terminal amino acid of the cleavage-activated H_N will be
cysteine;

Fig. 7 shows the C-terminal portion of the amino acid sequence of an embodiment
of the invention designated LFXa/3H423/A-IGF-1, a fusion protein;

Fig. 8 shows the C-terminal portion of the amino acid sequence of an embodiment
of the invention designated LFXa/3H423/A-CtxA14, a fusion protein;

Fig. 9 shows the C-terminal portion of the amino acid sequence of an embodiment
of the invention designated LFXa/3H423/A-ZZ, a fusion protein; the C-terminal ZZ
sequence begins at position A890 immediately after a genenase recognition site
(underlined);

Figs 10 & 11 show schematic representations of manipulations of polypeptides of
the invention; Fig. 10 shows LH423/A with N-terminal addition of an affinity
purification peptide (in this case GST) and C-terminal addition of an Ig binding
domain; protease cleavage sites R1, R2 and R3 enable selective enzymatic
separation of domains; Fig. 11 shows specific examples of protease cleavage
sites R1, R2 and R3 and a C-terminal fusion peptide sequence;

Fig. 12 shows the trypsin sensitive activation region of a polypeptide of the
invention;

Fig. 13 shows Western blot analysis of recombinant LH107/B expressed from
E. coli; panel A was probed with anti-BoNT/B antiserum; lane 1, molecular weight
standards; lanes 2 & 3, native BoNT/B; lane 4, immunopurified LH107/B; panel B
was probed with anti-T7 peptide tag antiserum; lane 1, molecular weight
standards; lanes 2 & 3, positive control E. coli T7 expression; lane 4,
immunopurified LH107/B.

The sequence listing that accompanies this application contains the following
sequences:

SEQ ID NO:   Sequence

 1           DNA coding for LH423/A
 2           LH423/A
 3           DNA coding for 23LH423/A (Q2E,N26K,A27Y), of which an
             N-terminal portion is shown in Fig. 4
 4           23LH423/A (Q2E,N26K,A27Y)
 5           DNA coding for 2LH423/A (Q2E,N26K,A27Y), of which an
             N-terminal portion is shown in Fig. 4
 6           2LH423/A (Q2E,N26K,A27Y)
 7           DNA coding for native BoNT/A according to Binz et al
 8           native BoNT/A according to Binz et al
 9           DNA coding for L/4H423/A
10           L/4H423/A
11           DNA coding for LFXa/3H423/A
12           LFXa/3H423/A
13           DNA coding for LFXa/3H423/A-IGF-1
14           LFXa/3H423/A-IGF-1
15           DNA coding for LFXa/3H423/A-CtxA14
16           LFXa/3H423/A-CtxA14
17           DNA coding for LFXa/3H423/A-ZZ
18           LFXa/3H423/A-ZZ
19           DNA coding for LH728/B
20           LH728/B
21           DNA coding for LH417/B
22           LH417/B
23           DNA coding for LH107/B
24           LH107/B
25           DNA coding for LH423/A (Q2E,N26K,A27Y)
26           LH423/A (Q2E,N26K,A27Y)
27           DNA coding for LH417/B wherein the first 274 bases are
             modified to have an E. coli codon bias
28           DNA coding for LH417/B wherein bases 691-1641 of the
             native BoNT/B sequence have been replaced by a
             degenerate DNA coding for amino acid residues 231-547
             of the native BoNT/B polypeptide



Example 1

A 2616 base pair, double stranded gene sequence (SEQ ID NO: 1) has been
assembled from a combination of synthetic, chromosomal and
polymerase-chain-reaction generated DNA (Figure 2). The gene codes for a
polypeptide of 871 amino acid residues corresponding to the entire light-chain (LC,
448 amino acids) and 423 residues of the amino terminus of the heavy-chain (HC)
of botulinum neurotoxin type A. This recombinant product is designated the LH423/A
fragment (SEQ ID NO: 2).

Construction of the recombinant product

The first 918 base pairs of the recombinant gene were synthesised by
concatenation of short oligonucleotides to generate a coding sequence with an
E. coli codon bias. Both DNA strands in this region were completely synthesised as
short overlapping oligonucleotides which were phosphorylated, annealed and
ligated to generate the full synthetic region ending with a unique restriction
site. The remainder of the LH423/A coding sequence was PCR amplified from total
chromosomal DNA from Clostridium botulinum and annealed to the synthetic
portion of the gene.

The internal PCR amplified product sequences were then deleted and replaced with
the native, fully sequenced, regions from clones of C. botulinum chromosomal
origin to generate the final gene construct. The final composition is synthetic DNA
(bases 1-913), polymerase amplified DNA (bases 914-1138 and 1976-2616) and
the remainder is of C. botulinum chromosomal origin (bases 1139-1975).
The gene was then fully sequenced and cloned into a variety of E. coli
plasmid vectors for expression analysis.
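
The figures quoted in this example can be cross-checked with simple arithmetic. The sketch below is an illustration only: it assumes that the stated segments tile the 2616 bp gene without gaps or overlaps, and that the coding sequence ends in a single stop codon; neither assumption is stated in the text, and the helper name `tile` is hypothetical.

```python
GENE_LENGTH_BP = 2616   # stated length of the assembled gene
LC_RESIDUES = 448       # stated light-chain length
H_RESIDUES = 423        # stated number of heavy-chain residues


def tile(starts, total_length):
    """Turn consecutive 1-based segment starts into inclusive (start, end) ranges.

    Assumes the segments cover 1..total_length with no gaps or overlaps.
    """
    ends = [next_start - 1 for next_start in starts[1:]] + [total_length]
    return list(zip(starts, ends))


# Segment start positions as quoted in the text.
segment_starts = [1, 914, 1139, 1976]
segment_origins = ["synthetic", "PCR-amplified", "chromosomal", "PCR-amplified"]

segments = tile(segment_starts, GENE_LENGTH_BP)
for origin, (first, last) in zip(segment_origins, segments):
    print(f"{origin:>14}: bases {first}-{last} ({last - first + 1} bp)")

codons = GENE_LENGTH_BP // 3
coded_residues = codons - 1  # assumption: one terminal stop codon
print(codons, coded_residues, LC_RESIDUES + H_RESIDUES)
```

Under these assumptions the 872 codons correspond to 871 residues plus a stop codon, which agrees with the sum of the stated light-chain and heavy-chain residue counts.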

The DNA is expressed in £ co// as a s|n , 

soluble singie chain polypeptide of 99 95 TaJ"^ ^''^ """""^ a 
gene is currently expressed in B co„ as , ! ^"""^ht.The 
coding sentence of g luta ,hione S-transferase (GsT, If Tl^™™"* «"■*• 
any of an extensive range of recombinant o.„. „v SC "' S '° S °™* , '>°'»°<"» but 

in other proxaryotic or euxaryotic Hosts ucb a" e Gr ! "** 
Currently, £. co // harbouring the e yn™«- 

- ,,.0. ph ,0. con,:: ::~ 0 ~: : r n in 

and 1 0 9/1 sodium chloride) at 37- c una, ,h„ 7T 9 b=cto-y.a s , extract 
absorbance of 0.4- 0.6 a, 600 nm anTl c <**- 
Phase. Expression of the gen °* " * ***** fl ro»vth 

isopropyfthio-^D-ga.actosidase OPTW J .T , ' ndU ° ed " V """^ °< 
Recombinant gene expression is allowed t „ ^SZZ? " " ^ 
temperature of 25°C The cell* u at a reduced 

HOTA. 0. 25% Tween , pH ? . 0 ^ ^ I 1^°™ ""'- 'OmM 

recombinant protein the cells arejisrupted bv . ■ °" °' ,ne 

cleared 0, debris by c^J^^T" 

-ubie recombinant fusfcn p'rotein ^T,^ ^"7" ^ ^ 

~n. A propor,ionofrecombi„a mm a,,r:^ 
The recombinant GST-LH423/A is purified by adsorption onto a commercially
prepared affinity matrix of glutathione Sepharose and subsequent elution with
reduced glutathione. The GST affinity purification marker is then removed by
proteolytic cleavage and reabsorption to glutathione Sepharose; recombinant
LH423/A is recovered in the non-adsorbed material.

Construct variants

A variant of the molecule, LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 26), has been
produced in which three amino acid residues have been modified within the light
chain of LH423/A, producing a polypeptide containing a light chain sequence different
to that of the published amino acid sequence of the light chain of BoNT/A.
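
The substitution notation used here (native residue, position, variant residue) can be applied mechanically. The following is a minimal sketch using a hypothetical toy sequence built only so that positions 2, 26 and 27 carry the stated native residues; it is not a sequence from the listing, and the function name is an assumption.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply substitutions such as 'Q2E' to a 1-based amino acid sequence.

    Each item names the native residue, its position and the variant residue;
    the native residue is checked before it is replaced.
    """
    residues = list(sequence)
    for item in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", item)
        if match is None:
            raise ValueError(f"unrecognised substitution: {item!r}")
        native, position, variant = match[1], int(match[2]), match[3]
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != native:
            raise ValueError(
                f"expected {native} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = variant
    return "".join(residues)


# Hypothetical 30-residue toy sequence: Q at 2, N at 26, A at 27.
toy_native = "MQ" + "G" * 23 + "NA" + "GGG"
toy_variant = apply_substitutions(toy_native, ["Q2E", "N26K", "A27Y"])
print(toy_variant)
```

Checking the native residue before replacing it guards against applying a substitution list to a sequence with different numbering, such as one carrying an N-terminal extension.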

Two further variants of the gene sequence that have been expressed and the
corresponding products purified are 23LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 4),
which has a 23 amino acid N-terminal extension as compared to the predicted
native L-chain of BoNT/A, and 2LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 6), which has
a 2 amino acid N-terminal extension (Figure 4).

In yet another variant a gene has been produced which contains an Eco 47 III
restriction site between nucleotides 1344 and 1345 of the gene sequence given in
SEQ ID NO: 1. This modification provides a restriction site at the position in the
gene representing the interface of the heavy and light chains in native neurotoxin,
and provides the capability to make insertions at this point using standard
restriction enzyme methodologies known to those skilled in the art. It will also be
obvious to those skilled in the art that any one of a number of restriction sites
could be so employed, and that the Eco 47 III insertion simply exemplifies this
approach. Similarly, it would be obvious for one skilled in the art that insertion of
a restriction site in the manner described could be performed on any gene of the
invention. The gene described, when expressed, codes for a polypeptide, L/4H423/A
(SEQ ID NO: 10), which contains an additional four amino acids between amino
acids 448 and 449 of LH423/A at a position equivalent to the amino terminus of the
heavy chain of the native neurotoxin.
A variant of the gene has been expressed, LFXa/3H423/A, in which a specific
protease cleavage site was incorporated at the carboxy-terminal end of the light
chain domain. The cleavage site incorporated was for Factor Xa protease and was
coded for by modification of SEQ ID NO: 1. It will be apparent to one skilled in the
art that a cleavage site for another protease could be similarly incorporated, and
that any gene sequence coding for the required cleavage site could be employed.
Modification of the gene sequence in this manner to code for a defined protease
site could be performed on any gene of the invention.

Variants of LFXa/3H423/A have been constructed in which a third domain is present
at the carboxy-terminal end of the polypeptide which incorporates a specific
binding activity into the polypeptide.

Specific examples described are:

(1) LFXa/3H423/A-IGF-1 (SEQ ID NO: 14), in which the carboxy-terminal domain has
a sequence equivalent to that of insulin-like growth factor-1 (IGF-1) and is able to
bind to the insulin-like growth factor receptor with high affinity;

(2) LFXa/3H423/A-CtxA14 (SEQ ID NO: 16), in which the carboxy-terminal domain
has a sequence equivalent to that of the 14 amino acids from the carboxy-terminus
of the A-subunit of cholera toxin (CtxA) and is thereby able to interact with the
cholera toxin B-subunit; and

(3) LFXa/3H423/A-ZZ (SEQ ID NO: 18), in which the carboxy-terminal domain is a
tandem repeating synthetic IgG binding domain. This variant also exemplifies the
inclusion in the gene of a sequence coding for a protease cleavage site located
between the end of the clostridial heavy chain sequence and the sequence coding
for the binding ligand. Specifically, in this example a sequence is inserted at
nucleotides 2650 to 2666 coding for a genenase cleavage site. Expression of this
gene produces a polypeptide which has the desired protease sensitivity at the
interface between the domain providing H_N function and the binding domain. Such
a modification enables selective removal of the C-terminal binding domain by
treatment of the polypeptide with the relevant protease.

It will be apparent that any one of a number of such binding domains could be
incorporated into the polypeptide sequences of this invention and that the above
examples are merely to exemplify the concept. Similarly, such binding domains can
be incorporated into any of the polypeptide sequences that are the basis of this
invention. Further, it should be noted that such binding domains could be
incorporated at any appropriate location within the polypeptide molecules of the
invention.

Further embodiments of the invention are thus illustrated by a DNA of the invention
further comprising a desired restriction endonuclease site at a desired location and
by a polypeptide of the invention further comprising a desired protease cleavage
site at a desired location.

The restriction endonuclease site may be introduced so as to facilitate further
manipulation of the DNA in manufacture of an expression vector for expressing a
polypeptide of the invention; it may be introduced as a consequence of a previous
step in manufacture of the DNA; it may be introduced by way of modification by
insertion, substitution or deletion of a known sequence. The consequence of
modification of the DNA may be that the amino acid sequence is unchanged, or
may be that the amino acid sequence is changed, for example resulting in
introduction of a desired protease cleavage site; either way the polypeptide retains
its first and second domains having the properties required by the invention.

Figure 10 is a diagrammatic representation of an expression product exemplifying
features described in this example. Specifically, it illustrates a single polypeptide
incorporating a domain equivalent to the light chain of botulinum neurotoxin type
A and a domain equivalent to the HN domain of the heavy chain of [illegible],
separated by specific protease [illegible] spatial separation of domains as
exemplified [illegible]. This concept is more specifically depicted in Figure 11,
where the [illegible] defined for the purpose of example.

Assay of product activity

The LC of botulinum neurotoxin type A exerts [illegible] activity on the synaptic
vesicle associated protein [illegible] in a specific manner at a single peptide
bond. The 2LH423/A (Q2E,N26K,A27Y) (SEQ ID NO: 6) cleaves a synthetic SNAP-25
substrate [illegible] as the native toxin (Figure 3). Thus [illegible] the
polypeptide sequence [illegible] functional LC domains does not prevent the
functional activity of the LC domains.

This activity is dependent on proteolytic modification of the [illegible]
GST-2LH423/A (Q2E,N26K,A27Y) to convert the single chain polypeptide product to
a disulphide linked dichain species. This [illegible] trypsin. The [illegible]
product (100-600 µg/ml) is incubated at 37 °C for 10-50 minutes with trypsin
(10 [unit illegible]) in [illegible]. The reaction is terminated by addition of
a 100-fold [illegible]. Activation by trypsin generates a [illegible], as
determined by polyacrylamide gel electrophoresis [illegible].



2LH423/A is more stable in the presence of trypsin and more active in the in vitro
peptide cleavage assay than is 23LH423/A. Both variants, however, are fully
functional in the in vitro peptide cleavage assay. This demonstrates that the
recombinant molecule will tolerate N-terminal amino acid extensions and this may
be expanded to other chemical or organic moieties as would be obvious to those
skilled in the art.

Example 2 

As a further exemplification of this invention a number of gene sequences have 
been assembled coding for polypeptides corresponding to the entire light-chain and 
varying numbers of residues from the amino terminal end of the heavy chain of 
botulinum neurotoxin type B. In this exemplification of the disclosure the gene 
sequences assembled were obtained from a combination of chromosomal and 
polymerase-chain-reaction generated DNA, and therefore have the nucleotide 
sequence of the equivalent regions of the natural genes, thus exemplifying the 
principle that the substance of this disclosure can be based upon natural as well
as synthetic gene sequences.

The gene sequences relating to this example were all assembled and expressed 
using methodologies as detailed in Sambrook J, Fritsch E F & Maniatis T (1989) 
Molecular Cloning: A Laboratory Manual (2nd Edition), Ford N, Nolan C, Ferguson
M & Ockler M (eds), Cold Spring Harbor Laboratory Press, New York, and known 
to those skilled in the art. 

A gene has been assembled coding for a polypeptide of 1171 amino acids
corresponding to the entire light-chain (443 amino acids) and 728 residues from the
amino terminus of the heavy chain of neurotoxin type B. Expression of this gene
produces a polypeptide, LH728/B (SEQ ID NO: 20), which lacks the specific neuronal
binding activity of full length BoNT/B.
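
The residue count quoted above can be checked with a short sketch. The two
lengths are the figures given in the text; the function names are illustrative,
and the nucleotide figure is simple three-per-codon arithmetic that leaves out
any stop codon.

```python
LIGHT_CHAIN_RESIDUES = 443  # entire light chain, as stated in the text
HEAVY_CHAIN_RESIDUES = 728  # residues from the amino terminus of the heavy chain


def fragment_length(light: int, heavy: int) -> int:
    """Total residues in a light-chain plus heavy-chain-fragment polypeptide."""
    return light + heavy


def coding_nucleotides(residues: int) -> int:
    """Nucleotides needed to encode the residues (three per codon, stop excluded)."""
    return 3 * residues


total = fragment_length(LIGHT_CHAIN_RESIDUES, HEAVY_CHAIN_RESIDUES)
print(total, coding_nucleotides(total))
```

The sum reproduces the 1171 amino acids stated for the LH728/B polypeptide.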

A gene has also been assembled coding for a variant polypeptide, LH417/B (SEQ ID
NO: 22), which possesses an amino acid sequence at its carboxy terminus
equivalent by amino acid homology to that at the carboxy-terminus of the heavy
chain fragment in native LHN/A.

A gene has also been assembled coding for a variant [illegible]



Construct Variants 

A variant of the coding sequence for the first 274 [illegible] SEQ ID NO: 21 has
been produced which, whilst being [illegible], still codes for the native
polypeptide.

Two double stranded sequences, a 268 base pair and a 951 base pair [illegible],
have been [illegible] using an overlapping [illegible]. [Illegible] these
sequences was designed to have an E. coli [illegible].

[Illegible] For the second sequence [illegible] nucleotides 691-1641 [illegible]
overlapping regions, [illegible] nucleotides, [illegible] melting temperatures
in the range 52-56 °C. In addition [illegible].

The synthetic sequence has been incorporated into the gene shown in SEQ ID NO:
21 in place of the original first 268 bases (and is shown in SEQ ID NO: 27).
Similarly, the sequence could be inserted into other genes of the examples.

Another variant sequence, equivalent to nucleotides 691 to 1641 of SEQ ID NO: 21
and employing non-native codon usage whilst coding for a native polypeptide
sequence, has been constructed using the internal synthetic sequence. This
sequence (SEQ ID NO: 28) can be incorporated, alone or in combination with other
variant sequences, in place of the equivalent coding sequence in any of the genes
of the example.
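
The idea of a non-native nucleotide sequence that still codes for the native
polypeptide can be illustrated with a minimal sketch using the standard genetic
code. The `native` fragment is a short run of codons of the kind shown in the
sequence listing; the `recoded` fragment is a hypothetical synonymous rewrite
made up for this illustration and is not SEQ ID NO: 27 or SEQ ID NO: 28.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_polypeptide(seq_a: str, seq_b: str) -> bool:
    """True when two nucleotide sequences translate to the same polypeptide."""
    return translate(seq_a) == translate(seq_b)


native = "GAA AAT ATT TCA ATA GAA"   # Glu Asn Ile Ser Ile Glu
recoded = "GAG AAC ATC TCT ATC GAA"  # hypothetical synonymous codons

print(translate(native), encodes_same_polypeptide(native, recoded))
```

Comparing translations in this way is one simple check that a change in codon
usage has left the encoded amino acid sequence untouched.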

Example 3 

An exemplification of the utility of this invention is as a non-toxic and effective
immunogen. The non-toxic nature of the recombinant, single chain material was
demonstrated by intraperitoneal administration in mice of GST-2LH423/A. The
polypeptide was prepared and purified as described above. The amount of
immunoreactive material in the final preparation was determined by enzyme linked
immunosorbent assay (ELISA) using a monoclonal antibody (BA11) reactive against a
conformation dependent epitope on the native LHN/A. The recombinant material
was [illegible] (NaCl 8 g/l, KCl 0.2 g/l, Na2HPO4 1.16 g/l, KH2PO4 0.2 g/l,
pH 7.4) and 0.5 ml volumes injected into [illegible] groups of 4 mice such that
each group of mice received 10, 5 and 1 micrograms of material respectively.
Mice were observed for 4 days and no deaths were seen.

For immunisation, 20 [unit illegible] of GST-2LH423/A in a 1.0 ml volume of
[illegible] emulsion (1:1 vol:vol) using Freund's complete (primary injections
only) or Freund's incomplete adjuvant was administered into guinea pigs via two
subcutaneous dorsal injections. Three injections at 10 day intervals were given
(day 1, day 10 and day 20), and antiserum collected on day 30. The antisera were
shown by ELISA to be immunoreactive against native botulinum neurotoxin type A
and [illegible] LHN/A. Antisera which were botulinum neurotoxin reactive at a
dilution of 1:2000 were used for evaluation of neutralising efficacy in mice. For
neutralisation assays [illegible] ml of antiserum was diluted into [illegible] ml
of gelatine phosphate buffer (GPB; Na2HPO4 anhydrous 10 g/l, gelatin [illegible])
containing a dilution range from 0.5 µg (5 x 10^-7 g) to 5 picograms
(5 x 10^-12 g) [illegible]. Aliquots of 0.5 ml were injected into mice
intraperitoneally and deaths recorded



WO 98/07864 

PCT/GB97/02273

over a 4 day period. The results are shown in Table [illegible]. It can be seen
that 0.5 ml of 1:40 diluted anti-GST-2LH423/A antiserum can protect mice against
intraperitoneal challenge with botulinum neurotoxin [illegible]
([illegible],000 mouse LD50; [illegible]).
TABLE [number illegible]. Neutralisation of botulinum neurotoxin in mice by guinea pig
anti-GST-2LH423/A antiserum.

Botulinum toxin/mouse

Survivors   0.[illegible] µg   0.005 µg   0.0005 µg   0.5 ng   0.005 ng   5 pg   Control (no toxin)

On Day 

4 4 4 4 4 4 

2 . 4 4 4 4 4 4 

4 4 4 

444 



TABLE [number illegible]. Neutralisation of botulinum neurotoxin in mice by non-immune
guinea pig antiserum.

Botulinum toxin/mouse
[dose column headings illegible]



On Day 

1 

2 

3 " 
4 



2 4 
0 4 
4 
4 



Example 4 

Expression of recombinant LH107/B in E. coli.



As an exemplification of the expression of a nucleic acid coding for an LHN of a
clostridial neurotoxin of a serotype other than botulinum neurotoxin type A, the
nucleic acid sequence (SEQ ID NO: 23) coding for the polypeptide LH107/B (SEQ
[illegible]) was [illegible] (New England BioLabs, Beverley, MA, USA) as
[illegible] N-terminal T7 fusion peptide, under IPTG induction [illegible].
[Illegible] were harvested and recombinant protein extracted as described
previously for LH423/A.

Recombinant protein was recovered and [illegible] bacterial paste lysates by
immunoaffinity adsorption to an immobilised [illegible] monoclonal antibody
using a T7 tag purification kit (New England BioLabs, Beverley, MA, USA). Purified
recombinant protein was analysed by gradient (4-20%) denaturing SDS-polyacrylamide
gel electrophoresis (Novex, [illegible]) and western blotting using polyclonal
anti-botulinum neurotoxin [illegible] antiserum [illegible] anti-T7 antiserum.
Western blotting reagents were from [illegible]; immunostained proteins were
visualised using the Enhanced [illegible] Amersham. The expression of an anti-T7
antibody and [illegible] type B antiserum reactive recombinant product is
demonstrated in Figure 13.

The recombinant product was [illegible] responsible for endopeptidase activity.

The invention thus provides [illegible] immunogens, enzyme standards [illegible]
as described in WO-A-94/2[illegible]300.
SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: MICROBIOLOGICAL RESEARCH AUTHORITY
(B) STREET: Centre For Applied Microbiology And Research, Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): SP4 0JG

(A) NAME: THE SPEYWOOD LABORATORY LIMITED
(B) STREET: 14 Kensington Square
(C) CITY: London
(E) COUNTRY: UK
(F) POSTAL CODE (ZIP): W8 5HH

(A) NAME: [illegible]
(B) STREET: Centre For Applied Microbiology And Research, Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: [illegible]
(F) POSTAL CODE (ZIP): [illegible]

(A) NAME: [illegible]
(B) STREET: [illegible] Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: [illegible]
(F) POSTAL CODE (ZIP): [illegible]

(A) NAME: [illegible]
(B) STREET: [illegible] Porton Down
(C) CITY: Salisbury
(D) STATE: Wiltshire
(E) COUNTRY: [illegible]
(F) POSTAL CODE (ZIP): SP4 0JG

(ii) TITLE OF INVENTION: Recombinant Toxin Fragments

(iii) NUMBER OF SEQUENCES: 28

(iv) COMPUTER READABLE FORM:
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release [illegible] (EPO)

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2616 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



48 



96 
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2616

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GTT GAC ATT GCC TAC ATr 15 

- - a. ^ g -J S g j- g . « s ^ g 

GTG AAG GCT TTC AAG ATT n*^ 30 

~ - «s - - S g - s - - « „ ffi . 

GAT ACA TTT ACG AAC CCr r-7v„ ^ 4$ 

^ - - ^ S £ SK - g s « „ „ „ 

60 

GCA AAG CAG GTG CCA GTT Tra 

«. «. ™ SI £ g « Asp e g g ?» «* ~» 

70 75 nr ^ Leu Ser Thr 

GAC AAC GAG AAG GAT AAC Tar- ^ * 80 

- «. ^ £ g »; « gy g «» „ s ^ 

GTT ATT GAC ACT AAr Trn 125 

~ a - * g g S S B g S - g g g g 

ATC CAG TTT GAG TGC AAR arn ^ 160 

MsgessagsgssBSB 

S = S = = = = = SBggggl- a 

=sssggga-- s -' sss 



2 35 

240 



144 
192 
240 
288
336 
384 
432 
480 
528 
576 
624
672 
720 
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„ ntr rrr ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 

« !S JE S g » * »s» «. 3; Tyr «. Met S.r 01, Leu 

t s ssssssssesssss i s s js 

phe lie Asp 280 2Bb 

275 

^ nn* tvpt ACA CTG AAC AAG GCT AAG TCC ATT GTG 
AAG TTT AAA GAT ATT GCA AGT ACA CTG AA^ ^ ^ ^ ^ ^ 

Lys Phe Lys Asp He Aia »« 30Q 

290 

r^Tvr* tat aTC AAA AAT GTT TTT AAA GAG AAA 

S S S S E Si 35 S £ £ £. - «. g. 

305 310 

^ ™* PAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 
TAT ^ C IS ^ Sp !£ sS Sly Lys Phe Ser Val Asp Lys Leu 

Tyr Leu Leu Ser biu »v 33Q 335 

355 

- '«» »SS2 55 S iS S S3 S S5 S - ~ 

Phe Asp Lys Ala Val Pne uy^ 3QQ 

^ ttt AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 

SSSSSSES2 2.** — 3* — -.-o.ia.jjj 

385 390 
405 

n7 ,j> — r TAT aAG TTG CTA TGT GTA AGA 
_ arT m tca TTA GAT AAA GGA TAC AAT AAG 

gssssssg s - ~ - a «- -* s 

435 

«. ttt ATC AAA GTT AAT AAT TGG GAC TTG TTT TTT 

GCA TTA AAT GAT TTA TGT ATC AAA GTT ^ ^ ^ ^ phe phe 

Ala Leu Asn Asp Leu Cys lie uy* ^ Q 
450 

465 470 

SSSSSSSSSSSSS2S2 

-2222SS2S2 = = 5b s!S 



768 



816 



864 



912 



960 



1008 



1056 



1104 



1152 



1200 



1248 



1296 



1344 



1392 



1440 



1488 



1536 
GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC CAA TTA 1584
Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 1632
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA TTT GAA 1680
Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA GCA TTA 1728
His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT GTA AAG 1776
Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 1824
Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT ACT ACG 1872
Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA CCT GCT 1920
Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT GCT TTA 1968
Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG ATT GCA 2016
Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG AAT AAG 2064
Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA AAT GAA 2112
Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA GCA AAG 2160
Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA GCT TTA 2208
Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG TAT AAT 2256
Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT GAT GAT 2304
Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT AAT ATA 2352
Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780
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/in * T rn rrr fiTT TCA TAT TTA ATG AAT TCT ATG 

SSiSS-ffiSSS ?S E 5; -» te K 

rrT GTT aaa CGG TTA GAA GAT TTT GAT GCT AGT CTT AAA 
S S £r gS Val £ S 9 Leu Glu Asp Phe Asp Ala Ser Leu Lys 

805 

- ar TAT ATA T AT GAT AAT AGA GGA ACT TTA ATT GGT 

Z S ™ 2i £ £ S. ffi «- »• «« «s Ile 01y 

F 820 

* rr»n"> a aa » rAT AAA GTT AAT AAT ACA CTT AGT ACA GAT 

Si !5 SSSSSg - a « « ». K « ». »p 

835 

^ncn r 1G CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT 
S P- S S£ S Ser Ty. Val Asp Asn Gin Arg Leu Leu Ser 

850 855 

ACA TTT ACT GAA TAT ATT AAG TAA 
S S Thr Glu Tyr lie Lys • 

865 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 872 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

„« =m «» «i w 01 ° pte Mn LyB isp K ° " l *5 ° ly 

v.1 Asp .1. Al. Tyr II. Ly, U. «| «-» «• «y "« «° *" 

v.i w. ai. ^1 ^= "« «• IS - "* "» ,al ?r ° ° 1U ^ 

„ Thr pH Thr Ash P» Glu Glu 01, Asp Leu «= Pro Pro Pro Glu 
Al. ,,1 Gl. V.1 Pro V.1 S.r Tyr Tyr A.p s.r Tbr Tyr L.U S.« Thr 

a!p Ash Glu W. »P «* ^ °* ^ ^ 

w II. Tyr s.r Thr Asp » Gly «| «« - — * S "* ^ 

Gly II. » «- «P •>» g ~ llC MP SI ° 1U ^ 

«, U. «P Thr Ash cys II. Ash v.1 II. Gl. Pro Asp Gly S.r Tyr 
M Tr Glu Glu L.u As; «- v.1 II. II. Gly Pro S.r Al. Asp II. 

He Gin Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr 



2400 

2448 

2496 

2544 

2592 

2616 
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

- - «, ». „„ c lu s=r £ „ val ^ Thr ssn ™ ^ ^ 
«r P„e „. ^ „ ,„ Ma vsl ^ 2 S m> ms ^ 

Leu Ile His Ala Gly His Arg Leu j_ , 

225 230 ^ «| Ala He Asn Pro Asn 

Arg Val Phe Lys Val Asn Thr Asn Ala nw -n. ° 
245 ASn Ala gj Tyr Olu Met Ser Gly ^ 

Glu Val Ser Phe Glu Glu Leu Arq Thr Ph m ^ 

260 Arg Thr Phe Gly Gly His Asp Ala Lys 

Phe lie Asp Ser Leu Gin Glu Asn Glu Phe Arc l.„ «n, 

275 280 Leu Tyr Tyr Asn 

285 

j;. *, P n. Ma s „„ leu ^ ^ ^ ^ ^ ^ ^ 
ffi - * ffi «. ^ „ et ^ ^ ™ phe ^ ^ 

^ - L e„ S „ gj ^ Ihr ser 01y ^ sm v ^ ^ ^ «• 

- * Jj. * u ^ Lye „ et ^ Thr 01u ^ »' ^ 

Asn Phe Val Lys Phe Phe Lys Val L~, »«„ » 

355 So eU ASn Lys Thr Tyr Leu. Asn 

Phe Asp Lys Ala Val Phe Lys lie Asn ti- „ , 

370 375 " *" 116 Val P « Ly S val Asn Tyr 

Thr lie Tyr Asp Gly Phe Asn Leu Arg Asn Thr » T 

385 390 3 ^ n Thr Asn Leu Ala Ala Asn 

Phe Asn Gly Gin Asn Thr Glu lie Asn Asn Net a u ^ 
405 Asn Met Asn Phe Thr Lys Leu 

415 

Lys Asn Phe Thr Gly Leu Phe Glu Phe iw , 

420 U *g Ty* Lys Leu Leu Cys Val Arg 

Gly He lie Thr Ser Lys Thr Lys Ser r„, » 

435 7 " % 5 0 Ser L *u Asp Lys Gly Tyr Asn Lys 

445 

Ala Leu Asn Asp Leu Cys lie Lys Val Asn Asn Tm a T 

450 455 n T 3 ^ Leu ^e Phe 

460 

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asn l*„ * T 

465 470 Leu *■» Lys Gly Glu Glu 

He Thr Ser Asp Thr Asn lie Glu Ala Ala Glu ni » - ^ 

485 ■ La Glu Glu Asn lie Ser Leu 

495 

Asp Leu lie Gin Gin Tyr Tyr Leu Thr- th, . 

500 ^ Thr Phe Asn Phe Asp Asn Glu Pro 

sio 

Glu Asn He Ser n e Glu Asn Leu Ser S^- a Tn 

515 £o Ser AS P Ile He Gly Gin Leu 
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645 650 655

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665 670

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680 685

Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu
690 695 700

Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys
705 710 715 720

Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu
725 730 735

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

Thr Phe Thr Glu Tyr Ile Lys *
865 870
(2) INFORMATION FOR SEQ ID NO: 3: 



48 

Leu 

15 

96 



144 
(i) SEQUENCE CHARACTERISTICS:
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2685

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GGA TCC CCA GGA ATT CAT h-rr nnr, ™~ 

«r - - ox, n. S J» £ S S 2; S - « « 2 

GAA TTC GAG CTC CCG GGT ACC ATr far- ™ 1S 

«. - «. ^ „ r Ih < ™ ffi K -j - - ffi ™ £c 

ssEsassssEssssse 

175 

190 



192 



240 



288 



336 



384 



432 



480



528 



576 
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- r~mr* *nn ttp arr TTC GAG GAG AGC CTG GAG 

S S 5 £ « s » ^ - s " s S olu s " " u Glu 

210 215 

RR _ rrr CTG TTG GGT GCA GGC AAG TTC GCA ACT GAT CCA 
£ Sp S Pro S S Sy Ala Bly g- *» ™ "> "J 

225 230 

ssgasssssssssssss 
S sssssssssssss ss 

275 

s s s a s s i = s ss s = a as s 

355 * ou - 

370 

sasssssssissassss 

He Val Pro Lys Val Asn Tyr 4 ' Q 415 

405 

~ **t ppt TAA AAT ACA GAA ATT AAT 

ESSES S K S S S - « - s Ile ~ 

420 

435 

— »fSS55SSSSS£S2 

Tyr Lys Leu Leu Cys Val *rg i- y 460 

450 



672 



720 



768 



816 



864 



912 



960 



1008 



1056 



1104 



1152 



1200 



1296 



1344 



1392 



1440 
AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1488
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
485 490 495

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1536
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
500 505 510

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1584
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
515 520 525

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1632
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
530 535 540

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1680
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
545 550 555 560

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1728
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
565 570 575

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1776
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
580 585 590

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1824
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
595 600 605

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1872
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
610 615 620

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1920
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
625 630 635 640

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1968
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
645 650 655

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 2016
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
660 665 670

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2064
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
675 680 685

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2112
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
690 695 700

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2160
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
705 710 715 720

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2208
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
725 730 735

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2256
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
740 745 750
AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2304
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
755 760 765

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2352
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
770 775 780

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2400
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
785 790 795 800

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2448
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
805 810 815

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2496
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
820 825 830

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2544
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
835 840 845

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2592
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
850 855 860

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2640
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
865 870 875 880

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2685
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
885 890 895



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 895 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 



Gly Ser Pro 



Gly He His Met Thr Ser Thr Arg Leu Gin Lys Leu Leu 



Glu Phe Glu Leu Pro Gly Thr Met Glu Phe Val Asn Lys Gin P_he Asn 
20 2b 
Lys Asp Pro Val Asn Gly Val Asp He Ala Tyr lie Lys He Pro 

Lys Tyr Gly Gin mt Gin Pro Val Lys Ala Phe Lys He His Asn Lys 

He Trp Val He Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly 

Z L« Pro Pro Pro Glu Ala Lys Gin Val Pro Val Ser Tyr Tyr 

85 

ASP Ser Thr Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys 



100 




Gly Val Thr Lys Leu Phe Glu Ara H- t, „ 

115 i 2 9 0 Ser Thr JJP Gly Arg 

Met Leu Leu Thr* Qp r tr i 

,1. ga „ „, Itt Pro ^ Tcp oiy ^ ^ 

Thr He Asp Thr Glu Leu Lvs Val tt- * 

145 150 ^ 116 AS P Jg Asn Cys lie Asn Val 

He Gin Pro Asp Gly Ser Tvr am c 160 

ig y iy r Arg ser Qlu Leu ^ 

He Gly Pro Ser Ala Asp lie H s r1n BK „, 

180 116 gj? p he Glu Cys L ys Ser Phe Gly 

His Glu Val Leu Asn Leu Thr Arn Sm m 

"5 As * Tyr Gly ser Thr Gin Ty r 

He Arg Phe Ser Pro Asp Phe Thr Phe *° 5 

210 215 y Pfte G * u Glu Ser Leu Glu 

Val Asp Thr Asn Pro Leu Leu Gly Ala Glv t n u 

225 230 y Ma g| Phe Ala Thr Asp Pro 

Ala Val Thr Leu Ala His Glu Leu He His ai. n, ^ 
245 U 116 J" Ala Gly His Arg Leu Tyr 

255 

Gly He Ala He Asn Pro Asn Ara Val t 

260 n Arg Val Phe Lys Val Asn Thr Asn Ala 

Tyr Tyr Glu Met Ser Gly Leu Qlu * 7 ° 

275 280 me Glu Glu Leu Arg Thr 

- «J «i- H is ^ Ua g. phe ne fep ^ J Mu ^ ^ 

«. ^ ^ ^ ^ gj ian Lys Phe Lys ^ m m> ^ ^ ^ 

Asn Lys Ala Lys Ser He Val Gly Thr Thr n e 

325 Thr Thr Ala Ser Leu Gin Tyr Met 

335 

Lys Asn Val Phe Lys Glu Lys Tvr r»„ i 

3 40 Y ^ s Tyr Leu Leu Ser Glu Asp Thr Ser Gly 

*s Phe S| r Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu 
Thr Glu ne ^ Thr Glu ^ ^ ^ ^ ^ ^ ™ ^ ^ ^ 

J- Arg Lys Thr Tyr Leu Asn Phe Asp Lys Ala val Phe Lys lie Asn 

He Val Pro Lys Val Asn Tyr Thr n. ^ , 4 °° 
405 Thr Ue Phe Asn Leu Arg 

415 

Thr to jj. Ma Ma ^ phe ^ ^ ^ ^ ^ ^ ^ 

*" " et js Phe Ihr Lys Le - a *- fh « »- g * - ~ «. «. 

445 

Tyr Lys Leu Leu Cys Val Ara Glv ti* ti 

450 ~f 7 116 Thr Ser L VS Thr Lys Ser 
Leu Asp Lys Gly Tyr Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val
465 470 475 480

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
485 490 495

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
500 505 510

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
515 520 525

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
530 535 540

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
545 550 555 560

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
565 570 575

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
580 585 590

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
595 600 605

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
610 615 620

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
625 630 635 640

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
645 650 655

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
660 665 670

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
675 680 685

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
690 695 700

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
705 710 715 720

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
725 730 735

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
740 745 750

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
755 760 765

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
770 775 780

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
785 790 795 800

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
805 810 815
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
820 825 830

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
835 840 845

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
850 855 860

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
865 870 875 880

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
885 890 895

(2) INFORMATION FOR SEQ ID NO: 5 : 



(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2622 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..2622

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

S2S5EE5S2SSS5SSS 

100 * 1Q | ^9 Met Leu Leu Thr Ser 

110 

x2 «? g m s S E g « j- s - f c „ ^ 

115 12 £ r ser Thr He Asp Thr Glu 

125 

140 ^ 



48 



96 



144 



192 



240 



288 



336 



384 



432 
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mr*m nna razv PTT AAC CTC GTA ATC ATC GGG CCC TCC GCG 
g £ £ E " £ " « « v al Ue n. «y »» ser 
145 150 

^ a-rr PAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC 
Z S tte £n S SS Cys Lys Ser Phe Gly His Glu Val Leu Asn 

165 

aar TPT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA 

2 S £ » " 5? 25 ~ si G1 " Ile " 9 5! s " pr ° 

^ ™o rrT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG 
£ S S 2 " £ S5 «. *r u« <a. v,a MP Thr Asn Pro 
195 200 

ATvn ttp rrA ACT GAT CCA GCG GTG ACC CTG GCA 
»2gS"S S S £ « «. V! L.U M. 

210 215 

^ nar rrc GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC 

ss ss s ;ia ss s ss ^ ^ Gly ne n. ^ 

225 230 

^ «nn aar rTT AAC ACC AAC GCC TAC TAC GAG ATG AGT 
S JS- S " S S S IE Shr jjj A!. Ty* Tyr «« » ? t s« 

»rr TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT 

S 2 SS !S £ £ ffi Si - 1« «- «- «» lis «■ "* 

260 Jbi> 
^ rar arc TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC 

S52S5SS2 «. «- «-.- 

275 

^ r*T ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC 

£i£iS£KESE-r«. - u * ws s " 

290 295 

™ „n S rr GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA 
£ vS S S S S Su Gin Tyr Met Lys Asn Val Phe Lys 

305 310 

rrT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT 

£5 j» 5; 2 2 S Si Sp s -w w. ~ ~ v ?| w 

325 

„_ ,. r -ta TAC AAA ATG TTA ACA GAG ATT TAC ACA 

AAA TTA AAA TTT GAT AAG TTA TAC AAA ^ ^ ^ ^ Thr 

Lys Leu Lys Phe Asp Lys Leu iyr ^y 350 
1 340 J * :> 

«™r, ttt TTT AAA GTA CTT AAC AGA AAA ACA TAT 

GAG GAT AAT TTT GTT AAG TTT TTT AAA GT ^ ^ ^ ^ ^ 

Glu Asp Asn Phe Val Lys Pne mm ^y 3g5 
355 JO 

„„„ ... rrc GTA TTT AAG ATA AAT ATA OTA CCT AAG GTA 
TTG AAT TTT GAT AAA GCC GTA TTT AA^ ^ ^ ne ^ prQ Lyg Val 

Leu Asn Phe Asp Lys Axa v** i ^ Q 

370 ^ 

S S 2 S S? S S S - S SE 52 S S I S 



AAA CTA AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT 1296
Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys
420 425 430

GTA AGA GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC 1344
Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr
435 440 445

AAT AAG GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG 1392
Asn Lys Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu
450 455 460

TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA 1440
Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly
465 470 475 480

GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA AAT ATT 1488
Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile
485 490 495

AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT 1536
Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn
500 505 510

GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC 1584
Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly
515 520 525

CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG 1632
Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys
530 535 540

TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA 1680
Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu
545 550 555 560

TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA 1728
Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu
565 570 575

GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT 1776
Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr
580 585 590

GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG 1824
Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp
595 600 605

GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT 1872
Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser
610 615 620

ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA 1920
Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly
625 630 635 640

CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT GTA GGT 1968
Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly
645 650 655

GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA CCA GAG 2016
Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu
660 665 670

ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT ATT GCG 2064
Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala
675 680 685
AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT AAA AGA 2112
Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg
690

AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT TGG TTA 2160
Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu
705

GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG AAA GAA 2208
Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu

GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG 2256
Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln
740 745

TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT 2304
Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile
755

GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT 2352
Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile
770 775

AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT 2400
Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn
785 790

TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT 2448
Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser
805 810

CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA 2496
Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu
820 825

*h-p raa GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT 

2SS S3 ^9 X- Lys Asp Lys Val Asn Asn Thr Leu Ser 

835 840 
GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA 
52 i£ S» Phe Gin Leu Ser Lys Tyr Val Asp Asn Gin Arg Leu 
850 855 

TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA
Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 874 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val

1 5
Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Lys Tyr Gly Gln Met

20
Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro
35 40



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«. tt; Asp Thr Phe * As„ p ro Glu 01u 01y ^ ^ ^ ^ 

p™ «a. ai, Lys 01n v , ; Pco ^ s „ ^ J p ° ^ ^ ^ ^ 

80 

Ser Thr Asp As„ „ u „ ys ^ to ^ ^ ^ ^ ^ ^ 

95 

Phe «u Aro a. Tyr Ser Thr Asp jj. 01y ^ „ ^ ^ ^ ^ 

a. « g ciy ,u pro Phe ^ Gly Gly ser Ihr ^ ™ Ihr Qlu 

Leu Lys val lie Asp Thr Asn Cys lie Asn v.l ti „ 

130 135 ys *■» »1 «e Gin Pro Asp My 

Ser Tyr Aro Ser clu gj Leu As„ „ u Val g. „, ^ ^ ^ ^ 

«p no „. 0 i„ jj. 0 i„ ^ Lys set oiy hu v ^ ^ ^ 

U 175 
Leu Thr Ar g Asj G!y Tyr 0 i y Ssr 01o ^ ^ ^ ^ 

190 

Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu cn„ ^ t » 

195 20 o Sr LeU Glu Val Asp Thr Asn Pro 

205 

Leu Leu Gly Ala Gly Lys Phe Ala Thy t, 

210 y 78 fi| Ala Thr Pro Ala Val Thr Leu Ala 

His Glu Leu He His Ala Gly His Arc Leu Tyr Gly Ile Ala ^ 
Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Z 
Sly Leu „. jjj ser Phe 01u „ ^ ^ phe ^ 

■ 3 270 
Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr

285 

Tyr As. Lys Phe Lys Asp 11. Ala Ser Thr Leu As„ Lys „. Lys S(!r 

» ™ ^ ^ JS S " - «» «*r J, lZ A.„ v.l Phe Lys 

320 

Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser rw t 

325 P Gly L ys-Phe Ser Val Asp 

J30 335 
Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr

3S0 

Glu Asp Asn Phe Val Lys Phe Phe Lvs Val i e „ » 

355 Y Leu A 811 ^9 h YS Thr Tyr 

365 

Leu Asn Phe Asp Lys Ala Val Phe Lys H e Asn II. o 

370 n 7 c y e ASn Iie Val Pro Lys Val 

J/3 380 

Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala

395 400 
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„, „ ph. ». ffl «- » * «■ 2! Asn Asn Met 5 Thr 

ws W » w. -» - Thr Oly X*» Ph. - - a «- 

«x *, ffl "° «- ** - a; *" Lys set <~ a Lys Gly 11,1 

» w . IT. «. «. -p i- *■ - - JS " ** MP ^ 

450 

Phe Ph . ~ « ~ «» * - - s Mp " u asn tys s 
1" «« u. - sj «- - - - ss "* A1 * Glu Glu a ue 

s „ ^ u MP gj U. «n «. TV, TJX - « «- - * - « 

Glu « «» 1 .U ~ XX. Jg - « S.r s.r Jg XX. XX. SXy 

G1 „ „ Txl « « « -j - « - SS Mn ° ly Lvs *" 

„ Z ^ >. e ~ - - His si "» m * oi " as 

Z «. HX. *Xy g. S.r « XX. Ma g. - « - V.X - - 

u , L .» „ » « - ~ JB 181 phe ' Pb= s " S MP * 

« W . - - - - Thr CX tt AXa M« ~ Jg - * ~ 
v.x 0 X» "I « vax Tyr Jg Ph. - -P «"» gj « " l ~ 
_ « x,s xx, Jg -P xx. Thr xx. ». XX. Pro Tyr xx. CXy 

" Ala lieu « - « « « SS «" "* ~ ~ - "* 

u , ^ XX. Ph. - Cly Ala V, XX. — - - gj - - 

660 

„. Ala XX. Pro vax - OX, Thr P.. - - V.X Jg Tyr XX. « 

Mn Lys « » T« val Jg T.X XI. xsp « »g - « ^ - 

690 

Mn ,X„ Trp M P OX; V.X Tyr Tyr XX. Val * « » gj 

T. val « gj - - - - III - ^ ^ - & 

Ala « OU, - «- M. - - " S ?S ^ 01 " 
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As„ ^ ^ Tit Mu Glu 

76,0 • L - ie Asn Phe Asn n e 

" ™ ^ ~ ^ & - »" - »- - Ala ^ lu 

- - - - ta 01 „ ^ s „ ^ J Tyt m ^ ^ 

Ser Met lie Pro Tvr Glv v a i T 800 

gj ~ ffi «. Asp Phe isp lla ser 

Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu
XX. «, gj V,, Asp A. Leu Lys Asp Lys v al " Uu ^ 

Thr Asp He Pro Phe Gln Leu 

850 855 yS ^ Val Asp Asn Gln Arg ^ 

Leu Ser Thr Phe Thr Glu Tyr Ile Lys *

870 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2613 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2613

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



AGA ATT TAT TCA ACT GAT CTT GGA AGA ATG TTG TTA ACA TCA ATA GTA 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

AGG GGA ATA CCA TTT TGG GGT GGA AGT ACA ATA GAT ACA GAA TTA AAA 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAT ACT AAT TGT ATT AAT GTG ATA CAA CCA GAT GGT AGT TAT 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCA GAA GAA CTT AAT CTA GTA ATA ATA GGA CCC TCA GCT GAT ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATA CAG TTT GAA TGT AAA AGC TTT GGA CAT GAA GTT TTG AAT CTT ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGA AAT GGT TAT GGC TCT ACT CAA TAC ATT AGA TTT AGC CCA GAT TTT 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACA TTT GGT TTT GAG GAG TCA CTT GAA GTT GAT ACA AAT CCT CTT TTA 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAA TTT GCT ACA GAT CCA GCA GTA ACA TTA GCA CAT GAA 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTT ATA CAT GCT GGA CAT AGA TTA TAT GGA ATA GCA ATT AAT CCA AAT 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

AGG GTT TTT AAA GTA AAT ACT AAT GCC TAT TAT GAA ATG AGT GGG TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTT GAG GAA CTT AGA ACA TTT GGG GGA CAT GAT GCA AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATA GAT AGT TTA CAG GAA AAC GAA TTT CGT CTA TAT TAT TAT AAT 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATA GCA AGT ACA CTT AAT AAA GCT AAA TCA ATA GTA 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACT ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365
TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG GAC TTG TTT TTT 1392
Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA GAA GAA 1440
Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470 475 480

ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA AAT ATT AGT TTA 1488
Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu
485 490 495

GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT GAT AAT GAA CCT 1536
Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505 510

GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT ATA GGC CAA TTA 1584
Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520 525

GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 1632
Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535 540

TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT CAA GAA TTT GAA 1680
Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
545 550 555 560

CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT AAC GAA GCA TTA 1728
His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu
565 570 575

TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA GAC TAT GTA AAG 1776
Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580 585 590

AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 1824
Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600 605

CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA GTA AGT ACT ACG 1872
Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610 615 620

GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT ATA GGA CCT GCT 1920
Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625 630 635 640
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« ,«t *tt TTA TAT AAA GAT GAT TTT GTA GGT GCT TTA 

™ ffi g & ss s a ™ E jg «• «- - i oi * s: - 

«.» rCT GTT ATT CTC TTA GAA TTT ATA CCA GAG ATT OCA 
S SESS5S ». Glu «. »* «. Gl» Ue A!. 

660 

n-r-r nrr TTT GCA CTT GTA TCA TAT ATT GCG AAT AAG 
ATA f 1 ££ S S S S S Leu Val Ser Tyr lie Ala As. Lys 

He Pro Val Leu ^ 685 

675 

sssssssssissassssss 

690 

705 

B -gS5S2S5£iSSSaSS 

saasBsssEBsssass 

740 

ssssaassssssssss 

SSS»2S=S.= S = *S" SSS 

770 

805 

-saassaaaaassass 
saaaas=S SS!=s2 J ss5 

835 

ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 860

ACA TTT ACT GAA TAT ATT AAG
Thr Phe Thr Glu Tyr Ile Lys
865 
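
The relationship between a CDS row and the translation printed beneath it can be checked mechanically. The sketch below is illustrative only and is not part of the listing: it assumes the standard genetic code (NCBI translation table 1), and the names `translate`, `codon_count` and `ROW_336` are hypothetical helpers introduced here. It translates the SEQ ID NO: 7 row that ends at nucleotide 336 and confirms that a 2613 base pair CDS corresponds to 871 codons, the length stated for SEQ ID NO: 8.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
# Assumption: the listing uses the standard code; "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    seq = "".join(cds.split()).upper()  # remove the spaces between codons
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def codon_count(cds_length_bp: int) -> int:
    """Number of codons in a CDS of the given length in base pairs."""
    if cds_length_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return cds_length_bp // 3


# Row of SEQ ID NO: 7 ending at nucleotide 336, copied from the listing.
ROW_336 = "AGA ATT TAT TCA ACT GAT CTT GGA AGA ATG TTG TTA ACA TCA ATA GTA"

if __name__ == "__main__":
    print(translate(ROW_336))   # RIYSTDLGRMLLTSIV
    print(codon_count(2613))    # 871
```

The one-letter output corresponds to the three-letter translation under that row (Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val). The same check can be applied to any intact row to spot transcription errors in either line.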

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 871 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
«* - Phe Val Asn Lys Qln ^ ^ ^ ^ ^ ^ a ^ ^ 



Val Asp lie Ala Tyr lie Lys il e Pro 

20 Pro Asn Ala Gly Gln M ^ Qln ^ 

Val Lys Ala Phe Lys lie His Sm t 

Has Asn Lys lie Trp Val ll f Pro Glu ^ 

Asp Thr Phe Thr Asn Pro Glu Glu Gl„ a , 

50 55 G1U ° ly Le « Asn Pro Pro Pro 

oO 

Ala Lys Gin Val Pro Val Ser -tw . 

70 ^r Thr ^ Leu Ser ^ 

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Glv V,i n>u ^ 
85 LyS G1 9 l Val Thr Lys Leu Phe Glu 

Arg lie Tyr Ser Thr Asp Leu Gly Aro t ^ 

100 y JSf Met Leu Leu T hr Ser il e Val 

Arg Gly He Pro Phe Trp Glv ri v o~ ^ ^ 

115 rp Gly Gly Ser Thr H e Asp ^ Qlu ^ ^ 

Val He Asp Thr Asn Cys lie A^n v=>i n 

130 Y SI ^ ^ Ue Gln J» Asp Gly Ser Tyr 

S - - 1. Leu Asn Leu Val Ile Ile ^ ^ ^ ^ ^ ^ 

ne Gin phe giu si - - - a - v. ^ Asn Leu 

- Asn Gly Tyr Gly Ser Thr Gln ffi Ile ^ pfae _ ^ ™ 

- - - - - - u. - _ _ 
ffi «, , he u , s asp Ma m ^ - Ma ^ ^ 

a u. «. M . Gly g. „ _ ^ Giy ^ ua ii= ^ ^ ^ 

Arg Val Phe Lys Val Asn Thr Asn Al» t, ^ 24 ° 
245 Tto Ala Tyr Tyr Glu Met ser Gly Le u 

Glu Val ser Phe Glu Glu Leu Ara Th* m. „ n 

260 Arg Thr Phe Gly Giy H is Asp Ala Lys 

270 

Phe lie Asp ser Leu Gln Glu Asn Glu nv,„ * 

27 5 JS U Phe Leu ^r Tyr Tyr Asn 

285 

* - „ n. ua f « ^ Lsu ^ Lyj ^ ^ ^ ^ 
S - Ma ser jj. G1 „ ^ „« Lye ^ J ^ Mu ^ 

Le u sec g. „ ar set Qly ^ ^ ^ ^ ^ »• 
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
440 445

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455 460

Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu
465 470

Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu

Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro
500 505

Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu
515 520

Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu
530 535

Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu

His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu

Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys
580

Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu
595 600

Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr
610

Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala
625

Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu
645

Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala
660 665

Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys
675 680
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Val Leu Thr Val Gin Thr tv 

lie asp Asn Wa ^ ? „ ^ ^ ^ 

Lys Trp Asp Glu Val Tvt- t,«. i/ 1 

«*. Tyr He val Thr Asn Xrp Leu ^ ^ 

Val Asn Thr Gin U e Asp Leu lie Am r ?2 ° 
725 He Arg Lys Lys Met Lys olu ^ ^ 

Glu Asn Gin Ala Glu Ala Thr Lys Ala n. T , ^ 

740 ^ S Ala He He Asn Tyr Gin Ty r Asn 

Gin Tyr Thr Glu Glu Glu Lys Asm Am T1 ^ 

7S5 ys Asn Asn n e Asn Phe ^ ^ ^ ^ 

Leu Ser Ser Lys Leu Asn Glu Ser ti * 

770 775 116 ASn ^ *}j Met He Asn H e 

* L " ^ ^ » ^ - - - £ - Met Asn ser Met 

ne *o Tyr Gly Val L ys Ar g Leu Glu As, Phe As P Ala Ser Leu Lys 

Asp Ala Leu Leu Lys Tyr 11^ tw * 815 

82o Tyr He Tyr Asp Asn Arg Gly Thr Leu He Gly 

rn„ t, t * 830 
Gin Val Asp Arg Leu Lys Asn r ve v , , 

835 ys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 

He Pro Phe Gin Leu Ser Lvs Tvr v.i » ' 

850 *** Val *" Jin Arg Leu Leu Ser 

Thr Phe Thr Glu Tyr Ile Lys

865 870 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2628 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2628

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

= = = = = SasS5S5gBJE5 

aKsssrssgjssss-gg. 
5f Esss5aaa?s2ss2- Sa 



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^ rv* PTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 

S 5! S! SS p£ SS IS ™ «. "P ^ » ser Thr 

65 70 

aar GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 
Z £n £u £s Asp £n Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu 

„ n . n ~r r ntzr CGT ATG CTG CTG ACC TCA ATC GTC 

S S S S £ S 2 S ^ « - - »- s IU val 

tot RPT GGC AGT ACC ATT GAC ACG GAG TTG AAG 

-ggS2S5sS»-»-»5 Glu " u L1 " 

115 

_ Trc ATT aac GTG ATC CAA CCA GAC GGT AGC TAC 

SS5SS" S E vsl xi. «. Jg «p «r »» ^ 

130 

~,~n r-rr GTA ATC ATC GGG CCC TCC GCG GAC ATT 

sssssss-ss S iu ffi « ~ «. x a. 

145 150 

or arr TTT GGC CAC GAA GTG TTG AAC CTG ACG 

SSESSSSS S His «. v.1 « - ~ 

165 

nGC TCT ACT CAG TAC ATT COT TTC AGC CCA GAC TTC 
Am un Gly Tyr Gly Ser Thr Gin Tyr lie Arg Phe ser Pro Asp Phe 

3 180 ' 

~m GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 

S Se SS SK S£ SS Glu Val Asp T^r Asn Pro Leu Leu 

195 

m „ apt GAT CCA GCG GTG ACC CTG GCA CAC GAG 

210 Z " 

„~r. ™t rrT CTG TAT GGC ATT GCG ATT AAC CCG AAC 
SSffiSSSSS" Gly U. Ala Xle A.» Pro jjj 

225 230 

„ « rr TA r TAC GAG ATG AGT GGT TTA 

245 

ssss:ss=ss=s;sas=s 

275 ^° 

-.—-gsssssssssss 

Lys Phe Lys Asp He Ala ser i« 3Q0 

290 

Gly Thr Thr Ala Ser Jbeu * 31g 32 o 

305 

n« TPT GGA AAA TTT TCG GTA GAT AAA TTA 
TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA T ^ ^ Leu 

Tyr Leu Leu Ser Glu Asp Thr Ser tiy ^ 33$ 



AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

AGC GCT GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT AAT AAT TGG 1392
Ser Ala Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp
450 455 460

GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT GAT CTA AAT 1440
Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn
465 470 475 480

AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA GCA GAA GAA 1488
Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu
485 490 495

AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC TTT AAT TTT 1536
Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe
500 505 510

GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA AGT GAC ATT 1584
Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile
515 520 525

ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT CCT AAT GGA 1632
Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly
530 535 540

AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT CTT CGT GCT 1680
Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala
545 550 555 560

CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA AAT TCT GTT 1728
Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val
565 570 575

AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT TTT TCT TCA 1776
Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser
580 585 590

GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT ATG TTT TTA 1824
Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu
595 600 605
GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA ACT AGC GAA 1872
Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu
610 615 620

GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT ATT CCA TAT 1920
Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr
625

ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA GAT GAT TTT 1968
Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe
645 650 655

GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA GAA TTT ATA 2016
Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile
660 665

CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT GTA TCA TAT 2064
Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr
675 680

ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT GCT TTA AGT 2112
Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser
690 695

AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA GTA ACA AAT 2160
Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn
705

TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA AAA AAA ATG 2208
Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met

AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC 2256
Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn
740

TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT 2304
Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe
755 760

AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT 2352
Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala
770 775

ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA 2400
Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu
785 790

ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT 2448
Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp
805

GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA 2496
Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly
820 825

^ .mm arr CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA 

T £ X S S5 Sn S3 Asp Arg Leu Lys Asp Lys Val Asn Asn Thr 

840 o'ko 



835 



arT » pa rat ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA 
2 |r J£ £p S Pro Phe Gin Leu Ser Lys Tyr Val Asp Asn Gin 



AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TAA 2628
Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys
865
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
*f - Phe Val Asn Ly 8 Gin Phe ^ ^ ^ ^ ^ ^ ^ ^ 

^ ^ ^ - ^ - - - -5 - - Met Gin Pro 

val Lys - phe Lys ne His r 0 - - n e Pr 3 0 ° Glu ^ 

Asp * Phe Thr AS n Pro G lu Glu Qly ^ ^ ^ « ^ ^ g ^ 

Ala Lys Gin Val Pro Val q*»>- 
65 Val ser Tyr ^ ^ ^ ^ ^ ^ ^ ^ 

Asp Asn Glu Lys Asp Asn Tyr Leu Lys G ly Val Thr r. T 

8S 9$ Vai Thr L vs Leu Phe Glu 

Arg lie Tyr Ser Thr Asp Leu Gly Aro M*.- r 95 

100 7 Jos " et LeU Leu Ser U e Va l 

Arg Gly lie Pro Phe Trp Gly Glv smr- «, 

US ^ «J S « Thr He Asp Thr Glu Leu Lys 

Val He Asp Thr Asn Cys He Asn v =1 Tn 

"0 ^ g| lie Gin Pro Asp Gly Ser Tyr 

Arg ser Glu Glu Leu Asn Leu Val il* T1 

145 150 U Val Ile Ile gy Pro Ser Ala Asp n e 

lie Gin Phe Glu Cys Lvs Ser M 160 

- - «, gj c ly ser ^ 01n ^ „. ^ ^ ^ ^ - ^ 

~ - S «. «. « lu s „ a 01u val ^ ^ ^ tou ^ 

-T «. *, ly8 «. Ma „ ^ ^ ^ ^ 2 AXa M8 Slu 

a u. ». Ma oay g, ^ ^ Qiy ^ ^ ^ ^ ^ 

Arg Val Phe Lys Val Asn Thr Asn Ala Tvr ^ „, **° 
245 ^n Ala Tyr Tyr Glu Met Ser Gly Leu 

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe ai „i 

260 9 Sf PhS Gly His Asp Ala Lys 

Phe He Asp Ser Leu Gin Glu Asn m„ , 

275 zS " ^ ^ LeU Tyr Tyr Tyr Asn 

285 

Lys Phe Lys Asp lie Ala s^r- t>,~ t 

290 P Ma f|f Thr Leu Asn Lys Ala Lys Ser ll e Val 
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys

Ser Ala Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp
450 455

Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn
465

Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu

Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe
500 505

Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile
515 520 525

Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly

Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala
545 550

Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val

Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser
580 585

Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu

Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu
610 615

Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr
625

Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe
645
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Val Gly Ala Leu Ile phe 

660 Gly Ala val He Leu Leu Qlu ^ ^ 

Pro Glu lie Ala lie Pro Val Leu Gly Thr ph 

675 680 Y Thr Phe a Leu Val Ser Tyr 

lie Ala Asn Lys Val Leu Thr w,i 

Thr Val G i n Thr ^ ^ ^ ^ ^ 

Lys Arg Asn Glu Lys Trp Asp Glu Val Tvr r„ 
05 710 ^ £f ^ r Val Thr Asn 

Trp Leu Ala Lys Val Asn Thr Gin a 720 

He Asp Leu n e ^ Lyg Lys ^ 

Lys Glu Ala Leu Glu Asn Gin Ala oi„ >i ?35 

Ala Glu Ala Thr Lys Ala ^ ^ ^ 

^ S ^ - - - -J - Olu L y S Asn Asn 2 Asn Phe 

- Hj Asp Asp Leu ser Ser Lys Leu Asn ^ ^ ^ ^ ^ ^ 

Met lie Asn He Asn Lys Phe Leu Asn Gin n, 

785 ™> ASn Gln g» Ser Val Ser Tyr Leu 

Met Asn Ser Met lie Pro Tyr Gly Val Lv, » 8 °° 
80S r y Val Lys Arg Leu Glu Asp Phe ^ 

Ma ser leu a - M * - - a - - * Z „ 
- - Sl «. ™ „ a ^ lya w ^ «• ^ ^ 

Leu Ser Thr Asp He Pro Phe Gin t 

855 ^ s Val Asp Asn Gin 

Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys

875 

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2637 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2637

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu

50 55
„™ aar TAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 
S 5s £n S3 Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr 

65 70 
„ r ,. r GAG aaG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 
J£ £n Glu JJ5 Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu 

PGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 
m l¥e Tvr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser lie Val 
Arg lie iyr ^ 105 HO 

aTC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 
Arg 2J ne Pro Phe Trp Gly Gly Ser Thr He Asp Thr Glu Leu Lys 
Arg u±y ^ 12Q 12 5 

PTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 
Sal i™ Sp S Asn Cys lie Asn Val He Gin Pro Asp Gly Ser Tyr 

130 135 
ACA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 
Arg Ser Glu Glu Leu Asn Leu Val He He Gly Pro Ser Ala Asp lie 
1 4 | 150 155 

1Tr r AG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 
ile £n P^ Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr 

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 
S itn Tyr Gly Ser Thr Gin Tyr lie Arg Phe Ser Pro Asp Phe 
X80 185 

AOS TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 
S S 2S Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 

195 200 zud 

RPT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 

210 215 
^, r . r err GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 

Su S Ss Sa Sy £s Arg Leu Tyr Gly He Ala He Asn Pro Asn 
225 230 235 

pee GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 
S Sal iy"s Sal Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 

2.45 



„„ fyriT, AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 
g£ SS Kr pS Su Glu Leu Arg Thr Phe Gly Gly Hi. Asp Ala Lys 
260 265 
rftP AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 
He Sp sir £u £n Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 
*AC TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 
Lys Phe Lys Asp He Ala Ser Thr Leu Asn Lys Ala Lys Ser He Val 



192 



240 



288 



336 



384 



432 



480 



528 



576 



624 



672 



720 



768 



816 



864 



912 
GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1824
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1872
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1920
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 1968
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

GAT AAT CAA AGA TTA TTA w-n - 

» - «■ ** - 2 E S S S SS £ S £ ^ 

... 875 r 

(2) INFORMATION FOR SEQ ID NO: 12:

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255



2592 
2637 
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys *
865 870 875

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2862 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2862

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1824
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1872
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1920
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 1968
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

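GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TCT AGG
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880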

CCT GGA CCG GAG ACG CTC Tr.r „ $ 880 

•» «y - «. », £ * « S «. « „ g rao 

890 P Ala Le " Gin 

TTC GTG TGT GGA GAC Aar , 895 

920 r XJ - e Vai Asp Glu Cys 

TGC TTC CGG AGC TGT GAT OTA ^ ^ 

- ss - - 4 a j- j- a s s s 5s s s 



(2) INFORMATION FOR SEQ ID NO: 14:

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110



2496 
2544 
2592 
2640 
2668 
2736 
2784 
2832 



940 

953 " XU Aia * 2862 
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

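Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815
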
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

Pro Gly Pro Glu Thr Leu Cys Gly Ala Glu Leu Val Asp Ala Leu Gln
885 890 895

Phe Val Cys Gly Asp Arg Gly Phe Tyr Phe Asn Lys Pro Thr Gly Tyr
900 905 910

Gly Ser Ser Ser Arg Arg Ala Pro Gln Thr Gly Ile Val Asp Glu Cys
915 920 925

Cys Phe Arg Ser Cys Asp Leu Arg Arg Leu Glu Met Tyr Cys Ala Pro
930 935 940

Leu Lys Pro Ala Lys Ser Ala Glu Ala *
945 950

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2724 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2724

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

ATG CAG TTC GTG AAC AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 48
Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

SS2SSSSSS52SSS2S 192 

50 55 

~ S£ S S 2j s ss = s » s s s s s a 2 " 
CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys
305 310 315 320

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu
325 330 335

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn
355 360 365

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr
370 375 380

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn
385 390 395 400

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu
405 410 415

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg
420 425 430

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys
435 440 445

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val
450 455 460

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn
465 470 475 480

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala
485 490 495

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr
500 505 510

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser
515 520 525

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe
530 535 540

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr
545 550 555 560

CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr
565 570 575

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe
580 585 590

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1824
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala
595 600 605

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1872
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu
610 615 620

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1920
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile
625 630 635 640

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 1968
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys
645 650 655

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu
660 665 670

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu
675 680 685

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn
690 695 700

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile
705 710 715 720

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg
725 730 735

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala
740 745 750

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn
755 760 765

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile
770 775 780

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val
785 790 795 800

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu
805 810 815

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp
820 825 830

AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val
835 840 845

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2592
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val
850 855 860

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TCT AGG 2640
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg
865 870 875 880

CCT CAA TCT AAA GTT AAA AGA CAA ATA TTT TCA GGC TAT CAA TCT GAT 2688
Pro Gln Ser Lys Val Lys Arg Gln Ile Phe Ser Gly Tyr Gln Ser Asp
885 890 895

ATT GAT ACA CAT AAT AGA ATT AAG GAT GAA TTA TGA 2724
Ile Asp Thr His Asn Arg Ile Lys Asp Glu Leu *
900 905

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 908 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro
20 25 30

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr
130 135 140

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile
145 150 155 160

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr
165 170 175

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe
180 185 190

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu
195 200 205

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu
210 215 220

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn
225 230 235 240

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu
245 250 255

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys
260 265 270

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn
275 280 285

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val
290 295 300

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ffi - », sla ser a „ ^ Met Ly ;^ ; ai ph= ^ ^ 

171 " u ieu sei ss - - - g. ^ s „ « „ Lye ;i: 

335 

Lys Phe Asp Lys Leu Tvr ive m~«- t 

3 J, Tyr Lys Met Leu Thr Glu Ile ^ Thr ^ 

Asn Phe Val Lys Phe Phe Lvs v.l t 

355 Lys val Leu Asn Arg Lys Thr Tyr Leu Asn 

Phe Asp Lys Ala Val Phe Lvs ti~ * 

370 J*J «• Asn lie Val Pro Lys Val Asn Tyr 

Thr He Tyr Asp Gly Phe Asn Leu Arq Asn a , 

385 390 ^ 9 Asn Thr As n ^u Ala Ala Asn 

Ann 

Phe Asn Gly Gin Asn Thr Glu Ile A on > „ 

4Q5 iu lie Asn Asn Met Asn Phe Thr Lys L eu 

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tvr- i r 

420 ^ U I?? ^ Lys Leu Leu Cys Val Arg 

430 

Gly lie lie Thr Ser Lys Thr Lvs c«. r » 

435 Y Tbr Ser ^u Asp Lys Gly Tyr Asn Lys 

He Glu Gly Arg Cys Asp Gly Ala Leu * » 

«0 455 6U As P Leu Cys lie Lys Val 

460 

JJJ «. Trp „ Leu phe s „ ^ ^ ^ ^ ^ 

480 

Asp Leu Asn Lys Glv Glu ri^ n m L • " 

4 lv Glu Glu He Thr Ser Asp Thr Asn lie Glu Ala 

495 

Ala Glu Glu Asn lie <5*»t- t~,. * 

500 L6U AS P J™ !!• Gin Gin Tyr Tyr Leu Thr 

~ »sn jj. ^ to oltt p „ gj ^ ^ ^ ^ ^ ^ 

525 

- £p ne „. „ r 01n a «„ _ ^ ^ iu ^ ^ 

JJJ i- x.y, L ys Jg 01u w ^ lys ffi ^ Met Phe ^ ^ 

aro „ a „. jjj Phe „„ ma ^ ^ ^ ^ ^ ^ •» 

575 

Asn Ser Val Asn Glu Ala r**„ * 

S80 Ala L6U Leu A £ Ser Arg Val Tyr Thr Phe 

K " P ~ «- a ~ «- «« «. «J «. Ala Ma 

« JJJ u. « r Trp « «j G1 „ teu val ^ g Z fc „ „ 

Thr te „. v a . s« T £ Tbl ^ „ e ^ ™ ne ^ ^ 
Ue Pro Tyr ,u gy Pro Ua lau 4an Gly ta „e e leu ^ ™ 



655 
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu 
660 665 670 

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu 
675 680 685 

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn 
690 695 700 

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile 
705 710 715 720 

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg 
725 730 735 

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala 
740 745 750 

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn 
755 760 765 

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile 
770 775 780 

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val 
785 790 795 800 

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu 
805 810 815 

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp 
820 825 830 

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val 
835 840 845 

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val 
850 855 860 

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Arg 
865 870 875 880 

Pro Gln Ser Lys Val Lys Arg Gln Ile Phe Ser Gly Tyr Gln Ser Asp 
885 890 895 

Ile Asp Thr His Asn Arg Ile Lys Asp Glu Leu * 
900 905 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3042 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3042 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
^ TTP OTG aac AAG CAG TTC AAC TAT AAG GAC CCT GTA AAC GGT 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly 
1 5 10 15 



48 
GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 96 
Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro 
20 25 30 

GTG AAG GCT TTC AAG ATT CAT AAC AAA ATC TGG GTT ATT CCG GAA CGC 144 
Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg 
35 40 45 

GAT ACA TTT ACG AAC CCG GAA GAA GGA GAC TTG AAC CCG CCG CCG GAA 192 
Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu 
50 55 60 

GCA AAG CAG GTG CCA GTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 240 
Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr 
65 70 75 80 

GAC AAC GAG AAG GAT AAC TAC CTG AAG GGA GTG ACC AAA TTA TTC GAG 288 
Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu 
85 90 95 

CGT ATT TAT TCC ACT GAC CTG GGC CGT ATG CTG CTG ACC TCA ATC GTC 336 
Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val 
100 105 110 

CGC GGA ATC CCA TTT TGG GGT GGC AGT ACC ATT GAC ACG GAG TTG AAG 384 
Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys 
115 120 125 

GTT ATT GAC ACT AAC TGC ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 432 
Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr 
130 135 140 

AGA TCT GAA GAA CTT AAC CTC GTA ATC ATC GGG CCC TCC GCG GAC ATT 480 
Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile 
145 150 155 160 

ATC CAG TTT GAG TGC AAG AGC TTT GGC CAC GAA GTG TTG AAC CTG ACG 528 
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr 
165 170 175 

CGT AAC GGT TAC GGC TCT ACT CAG TAC ATT CGT TTC AGC CCA GAC TTC 576 
Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe 
180 185 190 

ACG TTC GGT TTC GAG GAG AGC CTG GAG GTT GAT ACC AAC CCG CTG TTG 624 
Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 
195 200 205 

GGT GCA GGC AAG TTC GCA ACT GAT CCA GCG GTG ACC CTG GCA CAC GAG 672 
Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

CTG ATC CAC GCC GGT CAT CGT CTG TAT GGC ATT GCG ATT AAC CCG AAC 720 
Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn 
225 230 235 240 

CGC GTG TTC AAG GTT AAC ACC AAC GCC TAC TAC GAG ATG AGT GGT TTA 768 
Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

GAA GTA AGC TTC GAG GAA CTG CGC ACG TTC GGT GGC CAT GAT GCG AAG 816 
Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys 
260 265 270 

TTT ATC GAC AGC TTG CAG GAG AAC GAG TTC CGT CTG TAC TAC TAC AAC 864 
Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 285 
AAG TTT AAA GAT ATT GCA AGT ACA CTG AAC AAG GCT AAG TCC ATT GTG 912 
Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val 
290 295 300 

GGT ACC ACT GCT TCA TTA CAG TAT ATG AAA AAT GTT TTT AAA GAG AAA 960 
Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys 
305 310 315 320 

TAT CTC CTA TCT GAA GAT ACA TCT GGA AAA TTT TCG GTA GAT AAA TTA 1008 
Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu 
325 330 335 

AAA TTT GAT AAG TTA TAC AAA ATG TTA ACA GAG ATT TAC ACA GAG GAT 1056 
Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp 
340 345 350 

AAT TTT GTT AAG TTT TTT AAA GTA CTT AAC AGA AAA ACA TAT TTG AAT 1104 
Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn 
355 360 365 

TTT GAT AAA GCC GTA TTT AAG ATA AAT ATA GTA CCT AAG GTA AAT TAC 1152 
Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr 
370 375 380 

ACA ATA TAT GAT GGA TTT AAT TTA AGA AAT ACA AAT TTA GCA GCA AAC 1200 
Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn 
385 390 395 400 

TTT AAT GGT CAA AAT ACA GAA ATT AAT AAT ATG AAT TTT ACT AAA CTA 1248 
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu 
405 410 415 

AAA AAT TTT ACT GGA TTG TTT GAA TTT TAT AAG TTG CTA TGT GTA AGA 1296 
Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

GGG ATA ATA ACT TCT AAA ACT AAA TCA TTA GAT AAA GGA TAC AAT AAG 1344 
Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 
435 440 445 

ATC GAA GGT CGT TGC GAT GGG GCA TTA AAT GAT TTA TGT ATC AAA GTT 1392 
Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val 
450 455 460 

AAT AAT TGG GAC TTG TTT TTT AGT CCT TCA GAA GAT AAT TTT ACT AAT 1440 
Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn 
465 470 475 480 

GAT CTA AAT AAA GGA GAA GAA ATT ACA TCT GAT ACT AAT ATA GAA GCA 1488 
Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala 
485 490 495 

GCA GAA GAA AAT ATT AGT TTA GAT TTA ATA CAA CAA TAT TAT TTA ACC 1536 
Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr 
500 505 510 

TTT AAT TTT GAT AAT GAA CCT GAA AAT ATT TCA ATA GAA AAT CTT TCA 1584 
Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser 
515 520 525 

AGT GAC ATT ATA GGC CAA TTA GAA CTT ATG CCT AAT ATA GAA AGA TTT 1632 
Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe 
530 535 540 

CCT AAT GGA AAA AAG TAT GAG TTA GAT AAA TAT ACT ATG TTC CAT TAT 1680 
Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 
545 550 555 560 
CTT CGT GCT CAA GAA TTT GAA CAT GGT AAA TCT AGG ATT GCT TTA ACA 1728 
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr 
565 570 575 

AAT TCT GTT AAC GAA GCA TTA TTA AAT CCT AGT CGT GTT TAT ACA TTT 1776 
Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe 
580 585 590 

TTT TCT TCA GAC TAT GTA AAG AAA GTT AAT AAA GCT ACG GAG GCA GCT 1824 
Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala 
595 600 605 

ATG TTT TTA GGC TGG GTA GAA CAA TTA GTA TAT GAT TTT ACC GAT GAA 1872 
Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu 
610 615 620 

ACT AGC GAA GTA AGT ACT ACG GAT AAA ATT GCG GAT ATA ACT ATA ATT 1920 
Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile 
625 630 635 640 

ATT CCA TAT ATA GGA CCT GCT TTA AAT ATA GGT AAT ATG TTA TAT AAA 1968 
Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys 
645 650 655 

GAT GAT TTT GTA GGT GCT TTA ATA TTT TCA GGA GCT GTT ATT CTG TTA 2016 
Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu 
660 665 670 

GAA TTT ATA CCA GAG ATT GCA ATA CCT GTA TTA GGT ACT TTT GCA CTT 2064 
Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu 
675 680 685 

GTA TCA TAT ATT GCG AAT AAG GTT CTA ACC GTT CAA ACA ATA GAT AAT 2112 
Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn 
690 695 700 

GCT TTA AGT AAA AGA AAT GAA AAA TGG GAT GAG GTC TAT AAA TAT ATA 2160 
Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile 
705 710 715 720 

GTA ACA AAT TGG TTA GCA AAG GTT AAT ACA CAG ATT GAT CTA ATA AGA 2208 
Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg 
725 730 735 

AAA AAA ATG AAA GAA GCT TTA GAA AAT CAA GCA GAA GCA ACA AAG GCT 2256 
Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala 
740 745 750 

ATA ATA AAC TAT CAG TAT AAT CAA TAT ACT GAG GAA GAG AAA AAT AAT 2304 
Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn 
755 760 765 

ATT AAT TTT AAT ATT GAT GAT TTA AGT TCG AAA CTT AAT GAG TCT ATA 2352 
Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile 
770 775 780 

AAT AAA GCT ATG ATT AAT ATA AAT AAA TTT TTG AAT CAA TGC TCT GTT 2400 
Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val 
785 790 795 800 

TCA TAT TTA ATG AAT TCT ATG ATC CCT TAT GGT GTT AAA CGG TTA GAA 2448 
Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu 
805 810 815 

GAT TTT GAT GCT AGT CTT AAA GAT GCA TTA TTA AAG TAT ATA TAT GAT 2496 
Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp 
820 825 830 
AAT AGA GGA ACT TTA ATT GGT CAA GTA GAT AGA TTA AAA GAT AAA GTT 2544 
Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val 
835 840 845 

AAT AAT ACA CTT AGT ACA GAT ATA CCT TTT CAG CTT TCC AAA TAC GTA 2592 
Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val 
850 855 860 

GAT AAT CAA AGA TTA TTA TCT ACA TTT ACT GAA TAT ATT AAG TCA GGC 2640 
Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Gly 
865 870 875 880 

CTG AAT TCC CCG GGT GCA GCT CAT TAT GCG CAA CAC GAT GAA GCC GTA 2688 
Leu Asn Ser Pro Gly Ala Ala His Tyr Ala Gln His Asp Glu Ala Val 
885 890 895 

GAC AAC AAA TTC AAC AAA GAA CAA CAA AAC GCG TTC TAT GAG ATC TTA 2736 
Asp Asn Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu 
900 905 910 

CAT TTA CCT AAC TTA AAC GAA GAA CAA CGA AAC GCC TTC ATC CAA AGT 2784 
His Leu Pro Asn Leu Asn Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser 
915 920 925 

TTA AAA GAT GAC CCA AGC CAA AGC GCT AAC CTT TTA GCA GAA GCT AAA 2832 
Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys 
930 935 940 

AAG CTA AAT GAT GCT CAG GCG CCG AAA GTA GAC AAC AAA TTC AAC AAA 2880 
Lys Leu Asn Asp Ala Gln Ala Pro Lys Val Asp Asn Lys Phe Asn Lys 
945 950 955 960 

GAA CAA CAA AAC GCG TTC TAT GAG ATC TTA CAT TTA CCT AAC TTA AAC 2928 
Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu His Leu Pro Asn Leu Asn 
965 970 975 

GAA GAA CAA CGA AAC GCC TTC ATC CAA AGT TTA AAA GAT GAC CCA AGC 2976 
Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser 
980 985 990 

CAA AGC GCT AAC CTT TTA GCA GAA GCT AAA AAG CTA AAT GAT GCT CAG 3024 
Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys Lys Leu Asn Asp Ala Gln 
995 1000 1005 

GCG CCG AAA GTA GAC TAG 3042 
Ala Pro Lys Val Asp * 
1010 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1014 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly 
1 5 10 15 

Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln Met Gln Pro 
20 25 30 

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg 
35 40 45 

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu 
50 55 60 

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr 
65 70 75 80 

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu 
85 90 95 

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val 
100 105 110 

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys 
115 120 125 

Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp Gly Ser Tyr 
130 135 140 

Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser Ala Asp Ile 
145 150 155 160 

Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr 
165 170 175 

Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser Pro Asp Phe 
180 185 190 

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 
195 200 205 

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 
210 215 220 

Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile Asn Pro Asn 
225 230 235 240 

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu 
245 250 255 

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys 
260 265 270 

Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 
275 280 285 

Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys Ser Ile Val 
290 295 300 

Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe Lys Glu Lys 
305 310 315 320 

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu 
325 330 335 

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp 
340 345 350 

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn 
355 360 365 

Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys Val Asn Tyr 
370 375 380 

Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn 
385 390 395 400 

Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe Thr Lys Leu 
405 410 415 

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 
420 425 430 

Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 
435 440 445 

Ile Glu Gly Arg Cys Asp Gly Ala Leu Asn Asp Leu Cys Ile Lys Val 
450 455 460 

Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn 
465 470 475 480 

Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala 
485 490 495 

Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr 
500 505 510 

Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser 
515 520 525 

Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe 
530 535 540 

Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 
545 550 555 560 

Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr 
565 570 575 

Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe 
580 585 590 

Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala 
595 600 605 

Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu 
610 615 620 

Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile 
625 630 635 640 

Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys 
645 650 655 

Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu 
660 665 670 

Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu 
675 680 685 

Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn 
690 695 700 

Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile 
705 710 715 720 

Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg 
725 730 735 

Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala 
740 745 750 

Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn 
755 760 765 

Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile 
770 775 780 

Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val 
785 790 795 800 

Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu 
805 810 815 

Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp 
820 825 830 

Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val 
835 840 845 

Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val 
850 855 860 

Asp Asn Gln Arg Leu Leu Ser Thr Phe Thr Glu Tyr Ile Lys Ser Gly 
865 870 875 880 

Leu Asn Ser Pro Gly Ala Ala His Tyr Ala Gln His Asp Glu Ala Val 
885 890 895 

Asp Asn Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu 
900 905 910 

His Leu Pro Asn Leu Asn Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser 
915 920 925 

Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys 
930 935 940 

Lys Leu Asn Asp Ala Gln Ala Pro Lys Val Asp Asn Lys Phe Asn Lys 
945 950 955 960 

Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu His Leu Pro Asn Leu Asn 
965 970 975 

Glu Glu Gln Arg Asn Ala Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser 
980 985 990 

Gln Ser Ala Asn Leu Leu Ala Glu Ala Lys Lys Leu Asn Asp Ala Gln 
995 1000 1005 

Ala Pro Lys Val Asp * 
1010 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 3509 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..3509 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT GAT CCT ATT GAT AAT 48 
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn 
1 5 10 15 

AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT GCG AGA GGT ACG GGG AGA 96 
Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg 
20 25 30 

TAT TAT AAA GCT TTT AAA ATC ACA GAT CGT ATT TGG ATA ATA CCG GAA 144 
Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu 
35 40 45 

AGA TAT ACT TTT GGA TAT AAA CCT GAG GAT TTT AAT AAA AGT TCC GGT 192 
Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly 
50 55 60 

ATT TTT AAT AGA GAT GTT TGT GAA TAT TAT GAT CCA GAT TAC TTA AAT 240 
Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn 
65 70 75 80 

ACT AAT GAT AAA AAG AAT ATA TTT TTA CAA ACA ATG ATC AAG TTA TTT 288 
Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe 
85 90 95 

AAT AGA ATC AAA TCA AAA CCA TTG GGT GAA AAG TTA TTA GAG ATG ATT 336 
Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile 
100 105 110 

ATA AAT GGT ATA CCT TAT CTT GGA GAT AGA CGT GTT CCA CTC GAA GAG 384 
Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu 
115 120 125 

TTT AAC ACA AAC ATT GCT AGT GTA ACT GTT AAT AAA TTA ATC AGT AAT 432 
Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn 
130 135 140 

CCA GGA GAA GTG GAG CGA AAA AAA GGT ATT TTC GCA AAT TTA ATA ATA 480 
Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile 
145 150 155 160 

TTT GGA CCT GGG CCA GTT TTA AAT GAA AAT GAG ACT ATA GAT ATA GGT 528 
Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly 
165 170 175 

ATA CAA AAT CAT TTT GCA TCA AGG GAA GGC TTC GGG GGT ATA ATG CAA 576 
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln 
180 185 190 

ATG AAG TTT TGC CCA GAA TAT GTA AGC GTA TTT AAT AAT GTT CAA GAA 624 
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu 
195 200 205 

AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT TCA GAT CCA 672 
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro 
210 215 220 

GCC TTG ATA TTA ATG CAT GAA CTT ATA CAT GTT TTA CAT GGA TTA TAT 720 
Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr 
225 230 235 240 

GGC ATT AAA GTA GAT GAT TTA CCA ATT GTA CCA AAT GAA AAA AAA TTT 768 
Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe 
245 250 255 

TTT ATG CAA TCT ACA GAT GCT ATA CAG GCA GAA GAA CTA TAT ACA TTT 816 
Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe 
260 265 270 
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GGA GGA CAA GAT rvr. *„„ 

— - £ S £S5g S 2 s « „. « 4rc 

280 ier Th * Asp Lys Ser J*C 

TAT GAT AAA GTT TTG CAA »»„ 285 

E si a a a ? r p «* ccr Mc „ 1° 

cy. n. Ser ^ p Pro J* }» mt j„ „ T raT 

315 n xl e Tvr 

AAA AAT AAA TTT AAA GAT a»» 320 
I-ys Asn Lys Phe ^ ^ AAA TAT AAA TTC GTT GAA ^ TCT ■ 

32s p Lys Tyr Lys Phe Val Glu 2n q? ®° «» 
330 ASp Ser Glu Qiy 

AAA TAT AGT ATA GAT GTA CAA »«. 335 

ATG TTT GGT TTT ara o** 35 <> 

-~s"SS£ssaasasa 

ACT AGA GCT TOT tbt ~. 365 

-st-^-ssssaasaasis' 

AAT TTA TTA GAT AAT CM a ™ 

s »25sas*- 2fig . 

410 Asn Ala He 

AAT AAA CAA GCT TAT paa 41 5 

- «. a ^ « ffi « gta raT 

425 u was Leu Ala Val Tyr 

TAT ATA GAA AAT ran w« « 480 

"""•--Ssssaaasa-gg 

TTA ATA AGT AAA ATA ru -m., 495 
^ Ser Lys S 21 S S 2°* GM ACA GAA TCA ^ 

Pro ser G lu Asn Thr gj J? £ 

GAT TTT AAT* P m» _ 510 

ASP Phe JS Val ST pS ?? TAT GAA AAA CAA CCC GCT 

Pro v.l ryr Glu Lys gA CCC GCT ATA AAA 

AAA ATT TTT ACA GAT GAA AAT lw 

■» - * sp « s a s a as « ™ «. TCT « 

535 St 5 Leu Ser Gin 



864 
912 
960 
1008 
1056 
1104 
1152 
1200 
1248 
1296 
1344 
1392 
1440 
1488 
1536 
1584 
1632 
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-r-r-r rrr OTA GAT ATA AGA GAT ATA AGT TTA ACA TCT TCA TTT GAT 
S S S S S. Arg Asp lie Ser Leu Thr Ser Ser Phe Asp 

545 550 

„ „„„ cjuTK TTT TCT AAC AAA GTT TAT TCA TTT TTT TCT ATG GAT 

Asp a£ £u S S Asn Lys' Val Tyr Ser Phe Phe Ser Met Asp 

„_ ... , rT GCT aat AAA GTG GTA GAA GCA GGA TTA TTT GCA GOT 
JE"X $ S S IS Lys val val Glu Ala Oly Leu Phe Ala Oly 
1 580 585 

ita GTA AAT GAT TTT GTA ATC GAA GCT AAT AAA AGC 
S S? K SS S SE E Phe Va! He clu Ma >. Ur. ~ 
595 600 

PAT AAA ATT GCA GAT ATA TCT CTA ATT GTT CCT TAT ATA 
£ S E S 51 S Ala Asp Xle Ser Leu lie Val Pro Tyr He 
610 51b 

625 

i-gESSssssssassaa 

645 " 

CCT gta GTT GGA GCC TTT TTA TTA GAA TCA TAT ATT 

SJ S S S S S vS Gly Ala Phe Leu Leu Glu Ser Tyr Xle 

660 665 

sssgssssgsasassss 

S £ S£ K S S i K 5S S = S -S S ffi 5 

690 695 
705 710 

ssasSasaijBSjaiasssssE 
sjsssssssssasssaiss 

755 760 

--SSSSSSSSS5S SSSS 

770 77!> 
785 790 



1680 



1728 



1776 



1824 



1872 



1920 



1968 



2016 



2064 



2112 



2160 



2208 



2256 



2304 



2352 



2400 



2448 
TTG ATT GGA AGT GCA GAA TAT GAA AAA TCA AAA GTA AAT AAA TAC TTG 2496 
Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu 
820 825 830 

AAA ACC ATT ATG CCG TTT GAT CTT TCA ATA TAT ACC AAT GAT ACA ATA 2544 
Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile 
835 840 845 

CTA ATA GAA ATG TTT AAT AAA TAT AAT AGC GAA ATT TTA AAT AAT ATT 2592 
Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser Glu Ile Leu Asn Asn Ile 
850 855 860 

ATC TTA AAT TTA AGA TAT AAG GAT AAT AAT TTA ATA GAT TTA TCA GGA 2640 
Ile Leu Asn Leu Arg Tyr Lys Asp Asn Asn Leu Ile Asp Leu Ser Gly 
865 870 875 880 

TAT GGG GCA AAG GTA GAG GTA TAT GAT GGA GTC GAG CTT AAT GAT AAA 2688 
Tyr Gly Ala Lys Val Glu Val Tyr Asp Gly Val Glu Leu Asn Asp Lys 
885 890 895 

AAT CAA TTT AAA TTA ACT AGT TCA GCA AAT AGT AAG ATT AGA GTG ACT 2736 
Asn Gln Phe Lys Leu Thr Ser Ser Ala Asn Ser Lys Ile Arg Val Thr 
900 905 910 

CAA AAT CAG AAT ATC ATA TTT AAT AGT GTG TTC CTT GAT TTT AGC GTT 2784 
Gln Asn Gln Asn Ile Ile Phe Asn Ser Val Phe Leu Asp Phe Ser Val 
915 920 925 

AGC TTT TGG ATA AGA ATA CCT AAA TAT AAG AAT GAT GGT ATA CAA AAT 2832 
Ser Phe Trp Ile Arg Ile Pro Lys Tyr Lys Asn Asp Gly Ile Gln Asn 
930 935 940 

TAT ATT CAT AAT GAA TAT ACA ATA ATT AAT TGT ATG AAA AAT AAT TCG 2880 
Tyr Ile His Asn Glu Tyr Thr Ile Ile Asn Cys Met Lys Asn Asn Ser 
945 950 955 960 

GGC TGG AAA ATA TCT ATT AGG GGT AAT AGG ATA ATA TGG ACT TTA ATT 2928 
Gly Trp Lys Ile Ser Ile Arg Gly Asn Arg Ile Ile Trp Thr Leu Ile 
965 970 975 

GAT ATA AAT GGA AAA ACC AAA TCG GTA TTT TTT GAA TAT AAC ATA AGA 2976 
Asp Ile Asn Gly Lys Thr Lys Ser Val Phe Phe Glu Tyr Asn Ile Arg 
980 985 990 

GAA GAT ATA TCA GAG TAT ATA AAT AGA TGG TTT TTT GTA ACT ATT ACT 3024 
Glu Asp Ile Ser Glu Tyr Ile Asn Arg Trp Phe Phe Val Thr Ile Thr 
995 1000 1005 

AAT AAT TTG AAT AAC GCT AAA ATT TAT ATT AAT GGT AAG CTA GAA TCA 3072 
Asn Asn Leu Asn Asn Ala Lys Ile Tyr Ile Asn Gly Lys Leu Glu Ser 
1010 1015 1020 

AAT ACA GAT ATT AAA GAT ATA AGA GAA GTT ATT GCT AAT GGT GAA ATA 3120 
Asn Thr Asp Ile Lys Asp Ile Arg Glu Val Ile Ala Asn Gly Glu Ile 
1025 1030 1035 1040 

ATA TTT AAA TTA GAT GGT GAT ATA GAT AGA ACA CAA TTT ATT TGG ATG 3168 
Ile Phe Lys Leu Asp Gly Asp Ile Asp Arg Thr Gln Phe Ile Trp Met 
1045 1050 1055 

AAA TAT TTC AGT ATT TTT AAT ACG GAA TTA AGT CAA TCA AAT ATT GAA 3216 
Lys Tyr Phe Ser Ile Phe Asn Thr Glu Leu Ser Gln Ser Asn Ile Glu 
1060 1065 1070 

GAA AGA TAT AAA ATT CAA TCA TAT AGC GAA TAT TTA AAA GAT TTT TGG 3264 
Glu Arg Tyr Lys Ile Gln Ser Tyr Ser Glu Tyr Leu Lys Asp Phe Trp 
1075 1080 1085 
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SJS!SSSS2!li2SSB22S5 

1090 1095 

1105 1110 

^ nrr tat AAT CAA AAT TCT AAA TAT ATA AAT TAT 

S 2» S S £ E "S2 s,- «*• s.* 

- s = M s si s gs a a a sj.s s 

S2SaS55BSSSES,SS5 



1155 1160 



CTA GA 
Leu 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1169 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn 
1 5 10 15 

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg 
20 25 30 

Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu 
35 40 45 

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly 
50 55 60 

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn 
65 70 75 80 

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe 
85 90 95 

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile 
100 105 110 

Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu 
115 120 125 

Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn 
130 135 140 

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile 
145 150 155 160 

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly 
165 170 175 



3312 



3360 



3408 



3456 



3504 



3509 
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln 
180 185 190 

Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu 
195 200 205 

Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro 
210 215 220 

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr 
225 230 235 240 

Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe 
245 250 255 

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe 
260 265 270 

Gly Gly Gin Asp Pro Ser He Tl«. -ru „ 

275 2J0 ^ Pr ° Ser Th * **P *V Ser lie 

_ 285 
Tyr Asp Lys Val Leu Gin A<?n m,« * 

Asn Phe Arg Gly He val Asp Arg Leu Asn 

Lys Val Leu Val Cvs H« c«>- * 

Cys lie ser Asp Pro ^ Asn ^ ^ ^ ^ 

Lys Asn Lys Phe Lys Asp Lvs Tvr t,~ „, 32 ° 
325 P ^ ^ »• Val Glu Asp Ser Glu Gly 

335 

Lys Tyr Ser Ile Asp Val Glu Ser Ph» » 

34Q P Glu ser Phe Asp L ys Le U Tyr Lys Ser Leu 

Met Phe Gly Phe Thr Glu Thr Asn ri. al „, ^ 

355 Thr Asn He Ala Glu Asn Tyr Lys He Lys 

Thr Arg Ala Ser Tvr Phe c«» * 

yr Phe ser ^ ser Leu pro ^ ^ ^ 8 ^ e ^ 

Asn Leu Leu Asp Asn Glu H e Thr „ 

385 390 Thr lle G1U G1 * Phe Asn lie 

Ser Asp Lys Asp Met Glu Lys Glu Tyr Arc Glv m 

• 405 lyr G1 y Gin Asn Lys Ala He 

415 

Asn Lys Gin Ala Tyr Glu Glu H e r ™ 

420 11111 Ile J« L ^ Glu His Leu Ala Val Tyr 

Lys He Gin Met Cys Lys Ser Val Lys Ala Pro n Tn 

435 440 ° Gly Ile He Asp 

445 

Val Asp Asn Glu Asp Leu Phe Phe 11 „ m » 

45° 45I Ue Ala Asn Ser Phe Ser 

460 

- - ser Lys SS G1 " =• «• gj »- Tht « a s « ^ 

Tyr He Glu Asn Asp Phe Pro H P » „, 480 
48 P Pro He Asn Glu Leu He Leu Asp Thr Asp 

Leu He Ser Lys He Glu Leu Pro s»r ri, » 

500 SU Pro f« G1 " Asn Thr Glu Ser Leu Thr 

Asp Phe Asn Val Asp Val Pro val Tvr ri„ r 

SIS JJJ *y* Lys Gin Pro Ala lie Lys 



PCT/GB97/02273 

WO 98/07864 

- 93 - 

Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln 
530 535 540 

Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp 
545 550 555 560 

Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp 
565 570 575 

Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly 
580 585 590 

Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser 
595 600 605 

Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile 
610 615 620 

Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu 
625 630 635 640 

Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro 
645 650 655 

Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile 
660 665 670 

Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys 
675 680 685 

Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp 
690 695 700 

Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr 
705 710 715 720 

Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr 
725 730 735 

Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp 
740 745 750 

Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile 
755 760 765 

Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met 
770 775 780 

Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn 
785 790 795 800 

Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr 
805 810 815 

Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu 
820 825 830 

Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile 
835 840 845 

Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser Glu Ile Leu Asn Asn Ile 
850 855 860 

Ile Leu Asn Leu Arg Tyr Lys Asp Asn Asn Leu Ile Asp Leu Ser Gly 
865 870 875 880 
Tyr Gly Ala Lys Val Glu Val Tyr Asp Gly Val Glu Leu Asn Asp Lys 
885 890 895 

Asn Gln Phe Lys Leu Thr Ser Ser Ala Asn Ser Lys Ile Arg Val Thr 
900 905 910 

Gln Asn Gln Asn Ile Ile Phe Asn Ser Val Phe Leu Asp Phe Ser Val 
915 920 925 

Ser Phe Trp Ile Arg Ile Pro Lys Tyr Lys Asn Asp Gly Ile Gln Asn 
930 935 940 

Tyr Ile His Asn Glu Tyr Thr Ile Ile Asn Cys Met Lys Asn Asn Ser 
945 950 955 960 

Gly Trp Lys Ile Ser Ile Arg Gly Asn Arg Ile Ile Trp Thr Leu Ile 
965 970 975 

Asp Ile Asn Gly Lys Thr Lys Ser Val Phe Phe Glu Tyr Asn Ile Arg 
980 985 990 

Glu Asp Ile Ser Glu Tyr Ile Asn Arg Trp Phe Phe Val Thr Ile Thr 
995 1000 1005 

Asn Asn Leu Asn Asn Ala Lys Ile Tyr Ile Asn Gly Lys Leu Glu Ser 
1010 1015 1020 

Asn Thr Asp Ile Lys Asp Ile Arg Glu Val Ile Ala Asn Gly Glu Ile 
1025 1030 1035 1040 

Ile Phe Lys Leu Asp Gly Asp Ile Asp Arg Thr Gln Phe Ile Trp Met 
1045 1050 1055 

Lys Tyr Phe Ser Ile Phe Asn Thr Glu Leu Ser Gln Ser Asn Ile Glu 
1060 1065 1070 

Glu Arg Tyr Lys Ile Gln Ser Tyr Ser Glu Tyr Leu Lys Asp Phe Trp 
1075 1080 1085 

G1 " Ja*» ^ J- * - Tyr Tyr Met „. 01y 

1100 1 

Asn Lys Asn Ser Tvr 11^ r * » c * _ 

1105 H10 LyS ^ LyS L ^ S g 5 S-r Pro Val Gly Glu 

He Leu Thr Arg Ser Lys Tyr Asn Gin a * 112 ° 
1125 y ^ Gln J« Ser Lys Tyr He Asn Tyr 

1135 

Arg Asp Leu Tyr He Gly Glu Lvs Ph P tt t1 

1140 Y U Lys J h | 5 He He Arg Arg Lys Ser Asn 

Ser Gin Ser He Asn Asp Asd II- v a i » , 

1155 P ASP JJ| 0 Val Arg Lys Glu Asp Tyr He Tyr 



Leu 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 




(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2574



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT GAT CCT ATT GAT AAT 48
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT GCG AGA GGT ACG GGG AGA 96
Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

TAT TAT AAA GCT TTT AAA ATC ACA GAT CGT ATT TGG ATA ATA CCG GAA 144
Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

AGA TAT ACT TTT GGA TAT AAA CCT GAG GAT TTT AAT AAA AGT TCC GGT 192
Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

ATT TTT AAT AGA GAT GTT TGT GAA TAT TAT GAT CCA GAT TAC TTA AAT 240
Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

ACT AAT GAT AAA AAG AAT ATA TTT TTA CAA ACA ATG ATC AAG TTA TTT 288
Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

AAT AGA ATC AAA TCA AAA CCA TTG GGT GAA AAG TTA TTA GAG ATG ATT 336
Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

ATA AAT GGT ATA CCT TAT CTT GGA GAT AGA CGT GTT CCA CTC GAA GAG 384
Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

TTT AAC ACA AAC ATT GCT AGT GTA ACT GTT AAT AAA TTA ATC AGT AAT 432
Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

CCA GGA GAA GTG GAG CGA AAA AAA GGT ATT TTC GCA AAT TTA ATA ATA 480
Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

TTT GGA CCT GGG CCA GTT TTA AAT GAA AAT GAG ACT ATA GAT ATA GGT 528
Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

ATA CAA AAT CAT TTT GCA TCA AGG GAA GGC TTC GGG GGT ATA ATG CAA 576
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

ATG AAG TTT TGC CCA GAA TAT GTA AGC GTA TTT AAT AAT GTT CAA GAA 624
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT TCA GAT CCA 672
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

GCC TTG ATA TTA ATG CAT GAA CTT ATA CAT GTT TTA CAT GGA TTA TAT 720
Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

GGC ATT AAA GTA GAT GAT TTA CCA ATT GTA CCA AAT GAA AAA AAA TTT 768
Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

TTT ATG CAA TCT ACA GAT GCT ATA CAG GCA GAA GAA CTA TAT ACA TTT 816
Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

GGA GGA CAA GAT CCC AGC ATC ATA ACT CCT TCT ACG GAT AAA AGT ATC 864
Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

TAT GAT AAA GTT TTG CAA AAT TTT AGA GGG ATA GTT GAT AGA CTT AAC 912
Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

AAG GTT TTA GTT TGC ATA TCA GAT CCT AAC ATT AAT ATT AAT ATA TAT 960
Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

AAA AAT AAA TTT AAA GAT AAA TAT AAA TTC GTT GAA GAT TCT GAG GGA 1008
Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

AAA TAT AGT ATA GAT GTA GAA AGT TTT GAT AAA TTA TAT AAA AGC TTA 1056
Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

ATG TTT GGT TTT ACA GAA ACT AAT ATA GCA GAA AAT TAT AAA ATA AAA 1104
Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

ACT AGA GCT TCT TAT TTT AGT GAT TCC TTA CCA CCA GTA AAA ATA AAA 1152
Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

AAT TTA TTA GAT AAT GAA ATC TAT ACT ATA GAG GAA GGG TTT AAT ATA 1200
Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

TCT GAT AAA GAT ATG GAA AAA GAA TAT AGA GGT CAG AAT AAA GCT ATA 1248
Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

AAT AAA CAA GCT TAT GAA GAA ATT AGC AAG GAG CAT TTG GCT GTA TAT 1296
Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

AAG ATA CAA ATG TGT AAA AGT GTT AAA GCT CCA GGA ATA TGT ATT GAT 1344
Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

GTT GAT AAT GAA GAT TTG TTC TTT ATA GCT GAT AAA AAT AGT TTT TCA 1392
Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

GAT GAT TTA TCT AAA AAC GAA AGA ATA GAA TAT AAT ACA CAG AGT AAT 1440
Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

TAT ATA GAA AAT GAC TTC CCT ATA AAT GAA TTA ATT TTA GAT ACT GAT 1488
Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

TTA ATA AGT AAA ATA GAA TTA CCA AGT GAA AAT ACA GAA TCA CTT ACT 1536
Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
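500 505 510

GAT TTT AAT GTA GAT GTT CCA GTA TAT GAA AAA CAA CCC GCT ATA AAA 1584
Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

AAA ATT TTT ACA GAT GAA AAT ACC ATC TTT CAA TAT TTA TAC TCT CAG 1632
Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

ACA TTT CCT CTA GAT ATA AGA GAT ATA AGT TTA ACA TCT TCA TTT GAT 1680
Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp
545 550 555 560

GAT GCA TTA TTA TTT TCT AAC AAA GTT TAT TCA TTT TTT TCT ATG GAT 1728
Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp
565 570 575

TAT ATT AAA ACT GCT AAT AAA GTG GTA GAA GCA GGA TTA TTT GCA GGT 1776
Tyr Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly
580 585 590

TGG GTG AAA CAG ATA GTA AAT GAT TTT GTA ATC GAA GCT AAT AAA AGC 1824
Trp Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser
595 600 605

AAT ACT ATG GAT AAA ATT GCA GAT ATA TCT CTA ATT GTT CCT TAT ATA 1872
Asn Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile
610 615 620

GGA TTA GCT TTA AAT GTA GGA AAT GAA ACA GCT AAA GGA AAT TTT GAA 1920
Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu
625 630 635 640

AAT GCT TTT GAG ATT GCA GGA GCC AGT ATT CTA CTA GAA TTT ATA CCA 1968
Asn Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro
645 650 655

GAA CTT TTA ATA CCT GTA GTT GGA GCC TTT TTA TTA GAA TCA TAT ATT 2016
Glu Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile
660 665 670

GAC AAT AAA AAT AAA ATT ATT AAA ACA ATA GAT AAT GCT TTA ACT AAA 2064
Asp Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys
675 680 685

AGA AAT GAA AAA TGG AGT GAT ATG TAC GGA TTA ATA GTA GCG CAA TGG 2112
Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp
690 695 700

CTC TCA ACA GTT AAT ACT CAA TTT TAT ACA ATA AAA GAG GGA ATG TAT 2160
Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr
705 710 715 720

AAG GCT TTA AAT TAT CAA GCA CAA GCA TTG GAA GAA ATA ATA AAA TAC 2208
Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
725 730 735

AGA TAT AAT ATA TAT TCT GAA AAA GAA AAG TCA AAT ATT AAC ATC GAT 2256
Arg Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp
740 745 750

TTT AAT GAT ATA AAT TCT AAA CTT AAT GAG GGT ATT AAC CAA GCT ATA 2304
Phe Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile
755 760 765

GAT AAT ATA AAT AAT TTT ATA AAT GGA TGT TCT GTA TCA TAT TTA ATG 2352
Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met
770 775 780

AAA AAA ATG ATT CCA TTA GCT GTA GAA AAA TTA CTA GAC TTT GAT AAT 2400
Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn
785 790 795 800

ACT CTC AAA AAA AAT TTG TTA AAT TAT ATA GAT GAA AAT AAA TTA TAT 2448
Thr Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr
805 810 815

TTG ATT GGA AGT GCA GAA TAT GAA AAA TCA AAA GTA AAT AAA TAC TTG 2496
Leu Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu
820 825 830

AAA ACC ATT ATG CCG TTT GAT CTT TCA ATA TAT ACC AAT GAT ACA ATA 2544
Lys Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile
835 840 845

CTA ATA GAA ATG TTT AAT AAA TAT AAT AGC 2574
Leu Ile Glu Met Phe Asn Lys Tyr Asn Ser
850 855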



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 858 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

Thr Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp
545 550 555 560

Asp Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp 
565 570 575 

Tyr lie Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly 
580 585 590 

Trp Val Lys Gin lie Val Asn Asp Phe Val He Glu Ala Asn Lys Ser 
595 600 605 

Asn Thr Met Asp Lys He Ala Asp He Ser Leu He Val Pro Tyr He 
610 615 620 

Gly Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu 
625 630 635 640 

Asn Ala Phe Glu He Ala Gly Ala Ser lie Leu Leu Glu Phe He Pro 
645 650 655 

Glu Leu Leu He Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr He 
660 665 670 

Asp Asn Lys Asn Lys He He Lys Thr He Asp Asn Ala Leu Thr Lys 
675 680 685 

Arg Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu He Val Ala Gin Trp 
690 695 700 

Leu Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr
705 710 715 720

Lys Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr
725 730 735

Arg Tyr Asn He Tyr Ser Glu Lys Glu Lys Ser Asn He Asn He Asp 
740 745 750 

Phe Asn Asp He Asn Ser Lys Leu Asn Glu Gly He Asn Gin Ala He 
755 760 765 

Asp Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met
770 775 780

Lys Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn
785 790 795 800 

Thr Leu Lys Lys Asn Leu Leu Asn Tyr He Asp Glu Asn Lys Leu Tyr 
805 810 815 

Leu He Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu 
820 825 830 

Lys Thr He Met Pro Phe Asp Leu Ser He Tyr Thr Asn Asp Thr He 
835 840 845 

Leu He Glu Met Phe Asn Lys Tyr Asn Ser 
850 855 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1644 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1644

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

ATG CCA GTT ACA ATA AAT AAT TTT AAT TAT AAT GAT CCT ATT GAT AAT 48
Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

AAT AAT ATT ATT ATG ATG GAG CCT CCA TTT GCG AGA GGT ACG GGG AGA 96
Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

TAT TAT AAA GCT TTT AAA ATC ACA GAT CGT ATT TGG ATA ATA CCG GAA 144
Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

AGA TAT ACT TTT GGA TAT AAA CCT GAG GAT TTT AAT AAA AGT TCC GGT 192
Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

ATT TTT AAT AGA GAT GTT TGT GAA TAT TAT GAT CCA GAT TAC TTA AAT 240
Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

ACT AAT GAT AAA AAG AAT ATA TTT TTA CAA ACA ATG ATC AAG TTA TTT 288
Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

AAT AGA ATC AAA TCA AAA CCA TTG GGT GAA AAG TTA TTA GAG ATG ATT 336
Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

ATA AAT GGT ATA CCT TAT CTT GGA GAT AGA CGT GTT CCA CTC GAA GAG 384
Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

TTT AAC ACA AAC ATT GCT AGT GTA ACT GTT AAT AAA TTA ATC AGT AAT 432
Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

CCA GGA GAA GTG GAG CGA AAA AAA GGT ATT TTC GCA AAT TTA ATA ATA 480
Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

TTT GGA CCT GGG CCA GTT TTA AAT GAA AAT GAG ACT ATA GAT ATA GGT 528
Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

ATA CAA AAT CAT TTT GCA TCA AGG GAA GGC TTC GGG GGT ATA ATG CAA 576
Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190

ATG AAG TTT TGC CCA GAA TAT GTA AGC GTA TTT AAT AAT GTT CAA GAA 624
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

AAC AAA GGC GCA AGT ATA TTT AAT AGA CGT GGA TAT TTT TCA GAT CCA 672
Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

GCC TTG ATA TTA ATG CAT GAA CTT ATA CAT GTT TTA CAT GGA TTA TAT 720
Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

GGC ATT AAA GTA GAT GAT TTA CCA ATT GTA CCA AAT GAA AAA AAA TTT 768
Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255 

TTT ATG CAA TCT ACA GAT GCT ATA CAG GCA GAA GAA CTA TAT ACA TTT 816 
Phe Met Gin Ser Thr Asp Ala He Gin Ala Glu Glu Leu Tyr Thr Phe 
260 265 270 

GGA GGA CAA GAT CCC AGC ATC ATA ACT CCT TCT ACG GAT AAA AGT ATC 864 
Gly Gly Gin Asp Pro Ser He He Thr Pro Ser Thr Asp Lys Ser He 
275 280 285 

TAT GAT AAA GTT TTG CAA AAT TTT AGA GGG ATA GTT GAT AGA CTT AAC 912 
Tyr Asp Lys Val Leu Gin Asn Phe Arg Gly He Val Asp Arg Leu Asn 
290 295 300 

AAG GTT TTA GTT TGC ATA TCA GAT CCT AAC ATT AAT ATT AAT ATA TAT 960 
Lys Val Leu Val Cys He Ser Asp Pro Asn He Asn He Asn He Tyr 
305 310 315 320 

AAA AAT AAA TTT AAA GAT AAA TAT AAA TTC GTT GAA GAT TCT GAG GGA 1008 
Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335 

AAA TAT AGT ATA GAT GTA GAA AGT TTT GAT AAA TTA TAT AAA AGC TTA 1056 
Lys Tyr Ser He Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu 
340 345 350 

ATG TTT GGT TTT ACA GAA ACT AAT ATA GCA GAA AAT TAT AAA ATA AAA 1104
Met Phe Gly Phe Thr Glu Thr Asn He Ala Glu Asn Tyr Lys He Lys 
355 360 365 

ACT AGA GCT TCT TAT TTT AGT GAT TCC TTA CCA CCA GTA AAA ATA AAA 1152 
Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

AAT TTA TTA GAT AAT GAA ATC TAT ACT ATA GAG GAA GGG TTT AAT ATA 1200 
Asn Leu Leu Asp Asn Glu He Tyr Thr lie Glu Glu Gly Phe Asn He 
385 390 395 400 

TCT GAT AAA GAT ATG GAA AAA GAA TAT AGA GGT CAG AAT AAA GCT ATA 1248 
Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gin Asn Lys Ala He 
405 410 415 

AAT AAA CAA GCT TAT GAA GAA ATT AGC AAG GAG CAT TTG GCT GTA TAT 1296 
Asn Lys Gin Ala Tyr Glu Glu He Ser Lys Glu His Leu Ala Val Tyr 
420 425 430 

AAG ATA CAA ATG TGT AAA AGT GTT AAA GCT CCA GGA ATA TGT ATT GAT 1344 
Lys He Gin Met Cys Lys Ser Val Lys Ala Pro Gly He Cys He Asp 
435 440 445 

GTT GAT AAT GAA GAT TTG TTC TTT ATA GCT GAT AAA AAT AGT TTT TCA 1392 
Val Asp Asn Glu Asp Leu Phe Phe He Ala Asp Lys Asn Ser Phe Ser 
450 455 460 

GAT GAT TTA TCT AAA AAC GAA AGA ATA GAA TAT AAT ACA CAG AGT AAT 1440 
Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

TAT ATA GAA AAT GAC TTC CCT ATA AAT GAA TTA ATT TTA GAT ACT GAT 1488
Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

TTA ATA AGT AAA ATA GAA TTA CCA AGT GAA AAT ACA GAA TCA CTT ACT 1536
Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

GAT TTT AAT GTA GAT GTT CCA GTA TAT GAA AAA CAA CCC GCT ATA AAA 1584
Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

AAA ATT TTT ACA GAT GAA AAT ACC ATC TTT CAA TAT TTA TAC TCT CAG 1632
Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

ACA TTT CCT CTA 1644
Thr Phe Pro Leu
545
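
The CDS locations given in the feature lines can be cross-checked against the protein lengths, since each codon is three bases. The following is a minimal sketch of that arithmetic only; `codon_count` is a hypothetical helper name, not part of the listing, and it assumes the location is an inclusive, 1-based range as written in the (B) LOCATION lines.

```python
def codon_count(start: int, end: int) -> int:
    """Return the number of complete codons in an inclusive, 1-based CDS range."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        # A CDS whose length is not a multiple of three cannot be read in-frame.
        raise ValueError(f"location {start}..{end} is not a whole number of codons")
    return length // 3


# CDS 1..1644 of SEQ ID NO: 23
print(codon_count(1, 1644))  # prints 548
# CDS 1..2574 of SEQ ID NO: 21
print(codon_count(1, 2574))  # prints 858
```

The first value agrees with the 548 amino acids stated for SEQ ID NO: 24, and the second with the final residue number (858) in the SEQ ID NO: 22 listing.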



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 548 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn
1 5 10 15

Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly Thr Gly Arg
20 25 30

Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile Ile Pro Glu
35 40 45

Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys Ser Ser Gly
50 55 60

Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp Tyr Leu Asn
65 70 75 80

Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile Lys Leu Phe
85 90 95

Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu Glu Met Ile
100 105 110

Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro Leu Glu Glu
115 120 125

Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu Ile Ser Asn
130 135 140

Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn Leu Ile Ile
145 150 155 160

Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile Asp Ile Gly
165 170 175

Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly Ile Met Gln
180 185 190
Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn Val Gln Glu
195 200 205

Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe Ser Asp Pro
210 215 220

Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His Gly Leu Tyr
225 230 235 240

Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu Lys Lys Phe
245 250 255

Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu Tyr Thr Phe
260 265 270

Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp Lys Ser Ile
275 280 285

Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp Arg Leu Asn
290 295 300

Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile Asn Ile Tyr
305 310 315 320

Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp Ser Glu Gly
325 330 335

Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr Lys Ser Leu
340 345 350

Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr Lys Ile Lys
355 360 365

Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val Lys Ile Lys
370 375 380

Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly Phe Asn Ile
385 390 395 400

Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn Lys Ala Ile
405 410 415

Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu Ala Val Tyr
420 425 430

Lys Ile Gln Met Cys Lys Ser Val Lys Ala Pro Gly Ile Cys Ile Asp
435 440 445

Val Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser
450 455 460

Asp Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn
465 470 475 480

Tyr Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp
485 490 495

Leu Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr
500 505 510

Asp Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys
515 520 525

Lys Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln
530 535 540

Thr Phe Pro Leu
545

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2616 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2616



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

^ Ann pan tth aac TAT AAG GAC CCT GTA AAC GGT 

S K S ^2 En £s £n 22 £n Tyr Lys Asp Pro Val Asn Oly 

GTT GAC ATT GCC TAC ATC AAA ATT CCA AAC GCC GGC CAG ATG CAG CCG 
S2 Sp Jle Ala Tyr He Lys He Pro Asn Ala Gly Gin Met Gin Pro 

20 25 

5SESS5SSSS3SS2SSSS 

50 55 

« rra PTT TCA TAC TAC GAT TCA ACC TAT CTG AGC ACA 

S SE "I? S Sal IS £ Tyr Asp Ser Tnr Tyr Leu Ser Thr 
65 70 

100 10b 

115 120 

„ .^r nnr Trr ATT AAC GTG ATC CAA CCA GAC GGT AGC TAC 

SS£K£££32£S£S "<= «» 5 " p Gly s " ** 

130 135 

~r~n TvTvr* r-rr rTA ATC ATC GGG CCC TCC GCG GAC ATT 

sss!ssjss™s s. «j p~ «p s; 

145 150 

180 185 



48 



96 



144 



192 



240 



288 



336 



384 



432 



480 



528 



576
S E g E E SS3S S 5 . S E s a a 

205 

GGT GCA GGC AAG TTC GCA ACT rar nn, 

«y «. «, ^ £ S !S S S £ £ 2 s « s 

a £ £ s SJ a s a s ss s a jt r ~ - 

225 230 r y Ala Ile Asn Pro Asn 

240 

CGC GTG TTC AAG GTT AAC ACC a*r ™« 

- jg to s a a a a a s a a a 

255 

GAA GTA AGC TTC GAG GAA CTG err *nn 

-u v al s „ jj. Glu »„ a « a a a a a a s a 

270 

Esgsaaaaaaaaaaaa 

2 S 5 

SEaasaasaaasaasa 
Sssssaaaaaassaaa 

320 

TAT CTC CTA TCT GAA GAT ACA TCT nra aa, _ 

r - s " ss ^ *2?sS a a a a a 
asaaaaaa a a a ~ «« - « 

340 * C f* " Tnr G1 « He Tyr Thr Glu Asp 

350 

Esaasaaaaaaasaaa 
sE.aasaj-sasaaasaa 

380 

ACA ATA TAT GAT GGA TTT AAT Tra zn* 

Thr lie Tyr Asp Gly ™ J£ ™J ^ JJJ AAT TTA GCA GCA AAC 

385 390 u Asn Thr Asn Leu Ala Ala Asn 

_ 95 400 

EEEaasKssaaassaa 

410 415 

AAA AAT TTT ACT GGA TTC! ttt r»7i * m,._„ 

Lys Asn Phe Thr SJ E S SK K TGT GTA 

420 ^ Leu Cys Val Arg 

" 5 430 

435 y " Ser Le « Asp Lys Gly Tyr Asn Lys 

* V 445 1 

GCA TTA AAT GAT TTA TGT stp a»» 

Ala Leu Asn Asp Uu Cys Ue ST ^ ^ TG ° GAC ™ TTT TTT 

450 * LyS hys Val A sn Asn Trp Asp Leu Phe Phe 



624 
672 
720 
768 
816 
864 
912 
960 
1008 
1056 
1104 
1152 
1200 
1248 
1296 
1344 
1392
a™ r-iva GAT AAT TTT ACT AAT GAT CTA AAT AAA GGA GAA GAA 
S Pro 12 35 Z S Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 

465 470 

arT rat ATA GAA GCA GCA GAA GAA AAT ATT AGT TTA 
SSSS £ " S SS 1. JU «. Gl u Asn a e Ser U. 

485 

,™ pzxa TAT TAT TTA ACC TTT AAT TTT GAT AAT GAA CCT 

S 2 S Gin S£ ^ 2» Thr Phe Asn Phe Asp jjn Glu Pro 

„ mn * * rp rTT TCA A OT GAC ATT ATA GGC CAA TTA 

S2S2SS2222-. n. g. «, «. - 

515 52U 

at* GAA AGA TTT CCT AAT GGA AAA AAG TAT GAG 

2 K S S S 2S J£ 5 « » |jv «. ^ «- 

530 535 

m „ m nT P CAT tat CTT CGT GCT CAA GAA TTT GAA 

S S £ " S £ SI a »u cm «* «; 

545 550 

,„ rPT ACA AAT TCT GTT AAC GAA GCA TTA 

sssssssasi - ~ «- ° i » tou 

565 

_ p-- TAT aca TTT TTT TCT TCA GAC TAT GTA AAG 

Pro Ser Arg val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 

Leu Asn Pro ser Airg / 590 

580 3 " 

»«. PPT ACG GAG GCA GCT ATG TTT TTA GGC TGG GTA GAA 

$ s ^ si s s: «; io. « - - £v » v al =iu 

595 600 

aissssEssasssasss 

610 615 

5SSSSSSSSS2S.SSSS 

625 

S255S5SSSS2SgSSS 

660 

„ m ttt GCA CTT GTA TCA TAT ATT GCG AAT AAG 

S SS2SS2IS2; «1 S« Tvr gj Ala « L„ 

675 680 

ata PAT AAT GCT TTA AGT AAA AGA AAT GAA 

SSlSSSiSSSSS - se; «. - «» 

690 695 

S3SSSSS22SSSS22S 

705 710 



1440 



1488 



1536 



1584 



1632 



1680 



1728 



1776 



1824 



1872 



1920 



1968 



2016 



2064 



2112 



2160 



2208

GAA AAT CAA GCA GAA GCA ACA AAG GCT ATA ATA AAC TAT CAG TAT AAT 2256
Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

CAA TAT ACT GAG GAA GAG AAA AAT AAT ATT AAT TTT AAT ATT GAT GAT 2304
Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

TTA AGT TCG AAA CTT AAT GAG TCT ATA AAT AAA GCT ATG ATT AAT ATA 2352
Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

AAT AAA TTT TTG AAT CAA TGC TCT GTT TCA TAT TTA ATG AAT TCT ATG 2400
Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

ATC CCT TAT GGT GTT AAA CGG TTA GAA GAT TTT GAT GCT AGT CTT AAA 2448
Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

GAT GCA TTA TTA AAG TAT ATA TAT GAT AAT AGA GGA ACT TTA ATT GGT 2496
Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

CAA GTA GAT AGA TTA AAA GAT AAA GTT AAT AAT ACA CTT AGT ACA GAT 2544
Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

ATA CCT TTT CAG CTT TCC AAA TAC GTA GAT AAT CAA AGA TTA TTA TCT 2592
Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

ACA TTT ACT GAA TAT ATT AAG TAA 2616
Thr Phe Thr Glu Tyr Ile Lys *
865 870



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 872 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Met Gln Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro Val Asn Gly
1 5 10 15

Val Asp He Ala Tyr He Lys He Pro Asn Ala Gly Gin Met Gin Pro 
20 25 30 

Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile Pro Glu Arg
35 40 45

Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro Pro Pro Glu
50 55 60

Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr Leu Ser Thr
65 70 75 80

Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys Leu Phe Glu
85 90 95

Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr Ser Ile Val
100 105 110

Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr Glu Leu Lys
115 120 125
Val He Asp Thr Asn Cys He Asn Val He Gin Pro Asp Gly Ser Tyr 

130 135 
Arg Ser Glu Glu Leu Asn Leu Val He lie Gly Pro Ser Ala Asp lie 
145 150 

lie Gin Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu Asn Leu Thr 

Arg Asn Gly Tyr Gly Ser Thr Gin Tyr lie Arg Phe Ser Pro Asp Phe 

Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn Pro Leu Leu 

Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu Ala His Glu 

210 215 
Leu lie His Ala Gly His Arg Leu Tyr Gly lie Ala lie Asn Pro Asn 

225 

Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met Ser Gly Leu

Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His Asp Ala Lys 

, 260 ^ b!> 
Phe lie Asp Ser Leu Gin Glu Asn Glu Phe Arg Leu Tyr Tyr Tyr Asn 

275 ^ bU 
Lys Phe Lys Asp He Ala Ser Thr Leu Asn Lys Ala Lys Ser He Val 

290 ^ 
Gly Thr Thr Ala Ser Leu Gin Tyr Met Lys Asn Val Phe Lys Glu Lys 

Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val Asp Lys Leu

Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr Thr Glu Asp
340 345 350

Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr Tyr Leu Asn 

355 • 3b0 
Phe Asp Lys Ala Val Phe Lys He Asn He Val Pro Lys Val Asn Tyr 

370 3/ * 
Thr lie Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu Ala Ala Asn 
3B5 390 

Phe Asn Gly Gin Asn Thr Glu He Asn Asn Met Asn Phe Thr g. Leu 

Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu Cys Val Arg 

J 420 4 ^ 

Gly He He Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys 

435 * 4U 

Ala Leu Asn Asp Leu Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe
450 455

Ser *o S « «, ^ j_ Phe Thr _ isp " ^ ^ 

480 

lie Thr Ser Asp Thr Asn lie Glu Ala A i» m „ n 

485 " Ala A J* Glu Olu Asn lie Ser Leu 

495 

Asp Leu He Gin Gin Tyr Tyr Leu Thr Ph„ > 

500 Y yr ieu Thr Phe As « Phe Asp Asn Glu Pro 

3 510 
Glu Asn He Ser He Glu Asn Le» e 

515 ASn J~ Ser Ser Asp H e n e Gly Gln Leu 

Glu Leu Met Pro Asn He Glu Ara Ph* Bm » ~, 

530 s £ ^ Phe Pr ° Asn Gly L ys Lys Tyr Gl u 

Leu Asp Lys Tyr Thr Met Phe His Tvr r=. » 

545 ss l Pfte Hls ^Vr Leu Arg Ala Gin Glu Phe Glu 

560 

His Gly Lys Ser Arg He Ala Leu Thr Asn Ser v»l a 

565 „ n ser Val Asn Glu Ala Leu 

/0 575 
Leu Asn Pro Ser Arg Val Tvr Thr Pho dv. * 

580 9 ^ Thr | h | Phe s er Ser Asp Tyr Val Lys 

Lys Val Jsn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Z Val Glu 

Cln Leu val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Z Ser Thr Thr 

62 0 

Asp Lys He Ala Asp ll e Thr n« n ~ 

625 P 63 * Thr Ile »e lie Pro Tyr He Gly Pro Ala 

635 640 

Leu Asn He Gly Asn Met Leu Tvr t 

y g s 5 Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 

655 

a. «. Ser gj Ala v,l Iu Leu ft Glu ^ ^ ^ ^ 

003 670 
He Pro v,l _ Gly ^ ^ Leu y>1 ^ ^ Ma ^ ^ 

685 

« jj, a. val Gla Thr „ e ^ _ Ma ^ ^ ^ ^ ^ ^ 

Ts Trp Asp „„ V.l gj Lya ^ Ilo ^ J" ^ a ^ 

720 

Val Asn Thr Gin He Asp Leu Ile Ara rv= t „ 

725 * eu Arg Lys Lys Met Lys Glu Ala Leu 

Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn
740 745 750

Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp
755 760 765

Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile
770 775 780

Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met
785 790 795 800

Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys
805 810 815

Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly
820 825 830

Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp
835 840 845

Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser
850 855 860

Thr Phe Thr Glu Tyr Ile Lys *
865 870

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
ATGCCGGTTA CCATCAACAA CTTCAACTAC AACGACCCGA TCGACAACAA CAACATCATC 60
ATGATGGAAC CGCCGTTCGC ACGTGGTACC GGTCGTTACT ACAAGGCTTT CAAGATCACC 120
GACCGTATCT GGATCATCCC GGAACGTTAC ACCTTCGGTT ACAAACCTGA GGACTTCAAC 180
AAGAGTAGCG GGATTTTCAA TCGTGACGTC TGCGAGTACT ATGATCCAGA TTATCTGAAT 240
ACCAACGATA AGAAGAACAT ATTCCTTCAG ACTATGATGA AGTTATTTAA TAGAATCAAA 300
TCAAAACCAT TGGGTGAAAA GTTATTAGAG ATGATTATAA ATGGTATACC TTATCTTGGA 360
GATAGACGTG TTCCACTCGA AGAGTTTAAC ACAAACATTG CTAGTGTAAC TGTTAATAAA 420
TTAATCAGTA ATCCAGGAGA AGTGGAGCGA AAAAAAGGTA TTTTCGCAAA TTTAATAATA 480
TTTGGACCTG GGCCAGTTTT AAATGAAAAT GAGACTATAG ATATAGGTAT ACAAAATCAT 540
TTTGCATCAA GGGAAGGCTT CGGGGGTATA ATGCAAATGA AGTTTTGCCC AGAATATGTA 600
AGCGTATTTA ATAATGTTCA AGAAAACAAA GGCGCAAGTA TATTTAATAG ACGTGGATAT 660
TTTTCAGATC CAGCCTTGAT ATTAATGCAT GAACTTATAC ATGTTTTACA TGGATTATAT 720
GGCATTAAAG TAGATGATTT ACCAATTGTA CCAAATGAAA AAAAATTTTT TATGCAATCT 780
ACAGATGCTA TACAGGCAGA AGAACTATAT ACATTTGGAG GACAAGATCC CAGCATCATA 840
ACTCCTTCTA CGGATAAAAG TATCTATGAT AAAGTTTTGC AAAATTTTAG AGGGATAGTT 900
GATAGACTTA ACAAGGTTTT AGTTTGCATA TCAGATCCTA ACATTAATAT TAATATATAT 960
AAAAATAAAT TTAAAGATAA ATATAAATTC GTTGAAGATT CTGAGGGAAA ATATAGTATA 1020
GATGTAGAAA GTTTTGATAA ATTATATAAA AGCTTAATGT TTGGTTTTAC AGAAACTAAT 1080
ATAGCAGAAA ATTATAAAAT AAAAACTAGA GCTTCTTATT TTAGTGATTC CTTACCACCA 1140
GTAAAAATAA AAAATTTATT AGATAATGAA ATCTATACTA TAGAGGAAGG GTTTAATATA 1200

TCTGATAAAG ATATGGAAAA AGAATATAGA GGTCAGAATA AAGCTATAAA TAAACAAGCT 1260

TATGAAGAAA TTAGCAAGGA GCATTTGGCT GTATATAAGA TACAAATGTG TAAAAGTGTT 1320 

AAAGCTCCAG GAATATGTAT TGATGTTGAT AATGAAGATT TGTTCTTTAT AGCTGATAAA 1380 

AATAGTTTTT CAGATGATTT ATCTAAAAAC GAAAGAATAG AATATAATAC ACAGAGTAAT 1440 

TATATAGAAA ATGACTTCCC TATAAATGAA TTAATTTTAG ATACTGATTT AATAAGTAAA 1500 

ATAGAATTAC CAAGTGAAAA TACAGAATCA CTTACTGATT TTAATGTAGA TGTTCCAGTA 1560 

TATGAAAAAC AACCCGCTAT AAAAAAAATT TTTACAGATG AAAATACCAT CTTTCAATAT 1620 

TTATACTCTC AGACATTTCC TCTAGATATA AGAGATATAA GTTTAACATC TTCATTTGAT 1680 
GATGCATTAT TATTTTCTAA CAAAGTTTAT TCATTTTTTT CTATGGATTA TATTAAAACT 1740

GCTAATAAAG TGGTAGAAGC AGGATTATTT GCAGGTTGGG TGAAACAGAT AGTAAATGAT 1800 

TTTGTAATCG AAGCTAATAA AAGCAATACT ATGGATAAAA TTGCAGATAT ATCTCTAATT 1860 

GTTCCTTATA TAGGATTAGC TTTAAATGTA GGAAATGAAA CAGCTAAAGG AAATTTTGAA 1920 

AATGCTTTTG AGATTGCAGG AGCCAGTATT CTACTAGAAT TTATACCAGA ACTTTTAATA 1980 

CCTGTAGTTG GAGCCTTTTT ATTAGAATCA TATATTGACA ATAAAAATAA AATTATTAAA 2040 

ACAATAGATA ATGCTTTAAC TAAAAGAAAT GAAAAATGGA GTGATATGTA CGGATTAATA 2100 

GTAGCGCAAT GGCTCTCAAC AGTTAATACT CAATTTTATA CAATAAAAGA GGGAATGTAT 2160 

AAGGCTTTAA ATTATCAAGC ACAAGCATTG GAAGAAATAA TAAAATACAG ATATAATATA 2220 

TATTCTGAAA AAGAAAAGTC AAATATTAAC ATCGATTTTA ATGATATAAA TTCTAAACTT 2280 

AATGAGGGTA TTAACCAAGC TATAGATAAT ATAAATAATT TTATAAATGG ATGTTCTGTA 2340 

TCATATTTAA TGAAAAAAAT GATTCCATTA GCTGTAGAAA AATTACTAGA CTTTGATAAT 2400 

ACTCTCAAAA AAAATTTGTT AAATTATATA GATGAAAATA AATTATATTT GATTGGAAGT 2460 

GCAGAATATG AAAAATCAAA AGTAAATAAA TACTTGAAAA CCATTATGCC GTTTGATCTT 2520 

TCAATATATA CCAATGATAC AATACTAATA GAAATGTTTA ATAAATATAA TAGC 2574 
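
The opening 60 bases of SEQ ID NO: 27 differ in codon choice from the opening 60 bases of SEQ ID NO: 28, yet both can be read as the same 20 residues that begin SEQ ID NO: 24. The sketch below illustrates that comparison. It assumes the standard genetic code, and the names `CODON_TABLE`, `translate`, `SEQ27_START` and `SEQ28_START` are hypothetical, introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA (spaces allowed) into one-letter amino acid codes, frame 1."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 bases as printed in the two sequence descriptions.
SEQ27_START = "ATGCCGGTTA CCATCAACAA CTTCAACTAC AACGACCCGA TCGACAACAA CAACATCATC"
SEQ28_START = "ATGCCAGTTA CAATAAATAA TTTTAATTAT AATGATCCTA TTGATAATAA TAATATTATT"

print(translate(SEQ27_START))  # prints MPVTINNFNYNDPIDNNNII
print(translate(SEQ28_START))  # prints MPVTINNFNYNDPIDNNNII
print(translate(SEQ27_START) == translate(SEQ28_START))  # prints True
```

The one-letter string corresponds to Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro Ile Asp Asn Asn Asn Ile Ile in the three-letter notation used in the listing.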

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2574 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

ATGCCAGTTA CAATAAATAA TTTTAATTAT AATGATCCTA TTGATAATAA TAATATTATT 60 

ATGATGGAGC CTCCATTTGC GAGAGGTACG GGGAGATATT ATAAAGCTTT TAAAATCACA 120 

GATCGTATTT GGATAATACC GGAAAGATAT ACTTTTGGAT ATAAACCTGA GGATTTTAAT 180 

AAAAGTTCCG GTATTTTTAA TAGAGATGTT TGTGAATATT ATGATCCAGA TTACTTAAAT 240 




ACTAATGATA AAAAGAATAT ATTTTTACAA ACAATGATCA AGTTATTTAA TAGAATCAAA 300 

TCAAAACCAT TGGGTGAAAA GTTATTAGAG ATGATTATAA ATGGTATACC TTATCTTGGA 360 

GATAGACGTG TTCCACTCGA AGAGTTTAAC ACAAACATTG CTAGTGTAAC TGTTAATAAA 420 

TTAATCAGTA ATCCAGGAGA AGTGGAGCGA AAAAAAGGTA TTTTCGCAAA TTTAATAATA 480 

TTTGGACCTG GGCCAGTTTT AAATGAAAAT GAGACTATAG ATATAGGTAT ACAAAATCAT 540 

TTTGCATCAA GGGAAGGCTT CGGGGGTATA ATGCAAATGA AGTTTTGCCC AGAATATGTA 600 

AGCGTATTTA ATAATGTTCA AGAAAACAAA GGCGCAAGTA TATTTAATAG ACGTGGATAT 660 

TTTTCAGATC CAGCCTTGAT ATTAATGCAT GAACTCATCC ACGTCCTCCA CGGTCTCTAC 720 

GGTATCAAAG TAGACGACCT CCCGATCGTC CCGAACGAAA AAAAATTCTT CATGCAGAGC 780 

ACCGACGCAA TCCAGGCAGA AGAACTCTAC ACCTTCGGTG GTCAGGACCC GAGCATCATC 840 

ACCCCGAGCA CCGACAAAAG CATCTACGAC AAAGTCCTCC AGAACTTCCG TGGTATCGTC 900 

GACCGTCTCA ACAAAGTCCT CGTCTGCATC AGCGACCCGA ACATCAACAT CAACATCTAC 960 

AAAAACAAAT TCAAAGACAA ATACAAATTC GTCGAAGACA GCGAAGGTAA ATACAGCATC 1020 

GACGTCGAGA GCTTCGACAA ACTCTACAAA AGCCTCATGT TCGGTTTCAC CGAAACCAAC 1080 

ATCGCAGAAA ACTACAAAAT CAAAACCCGT GCAAGCTACT TCAGCGACAG CCTCCCGCCG 1140 

GTCAAAATCA AAAACCTCCT CGACAACGAA ATCTACACCA TCGAAGAAGG TTTCAACATC 1200 

AGCGACAAAG ACATGGAAAA AGAATACCGT GGTCAGAACA AAGCAATCAA CAAACAAGCT 1260 

TACGAAGAAA TCAGCAAAGA ACACCTCGCA GTCTACAAAA TCCAGATGTG CAAAAGCGTC 1320 

AAAGCACCGG GTATCTGCAT CGACGTTGAC AACGAAGACC TCTTCTTCAT CGCAGACAAA 1380 

AACAGCTTCA GCGACGACCT CAGCAAAAAC GAACGTATCG AATACAACAC CCAGAGCAAC 1440 

TACATCGAAA ACGACTTCCC GATCAACGAA CTCATCCTCG ACACCGACCT CATCAGCAAA 1500 

ATCGAACTCC CGAGCGAAAA CACCGAAAGC CTCACCGACT TCAACGTTGA CGTCCCGGTC 1560 

TACGAAAAAC AGCCGGCAAT CAAAAAAATC TTCACCGACG AAAACACCAT CTTCCAGTAC 1620 

CTCTACAGCC AGACCTTCCC GCTAGATATA AGAGATATAA GTTTAACATC TTCATTTGAT 1680 

GATGCATTAT TATTTTCTAA CAAAGTTTAT TCATTTTTTT CTATGGATTA TATTAAAACT 1740 

GCTAATAAAG TGGTAGAAGC AGGATTATTT GCAGGTTGGG TGAAACAGAT AGTAAATGAT 1800 

TTTGTAATCG AAGCTAATAA AAGCAATACT ATGGATAAAA TTGCAGATAT ATCTCTAATT 1860 

GTTCCTTATA TAGGATTAGC TTTAAATGTA GGAAATGAAA CAGCTAAAGG AAATTTTGAA 1920 

AATGCTTTTG AGATTGCAGG AGCCAGTATT CTACTAGAAT TTATACCAGA ACTTTTAATA 1980 

CCTGTAGTTG GAGCCTTTTT ATTAGAATCA TATATTGACA ATAAAAATAA AATTATTAAA 2040 

ACAATAGATA ATGCTTTAAC TAAAAGAAAT GAAAAATGGA GTGATATGTA CGGATTAATA 2100 

GTAGCGCAAT GGCTCTCAAC AGTTAATACT CAATTTTATA CAATAAAAGA GGGAATGTAT 2160 

AAGGCTTTAA ATTATCAAGC ACAAGCATTG GAAGAAATAA TAAAATACAG ATATAATATA 2220 

TATTCTGAAA AAGAAAAGTC AAATATTAAC ATCGATTTTA ATGATATAAA TTCTAAACTT 2280 






AATGAGGGTA TATAGATAAT ATAAATAATT TTATAAATGG [illegible] 2340 

[illegible] 2400 

[illegible] 2460 

[illegible] TACTTGAAAA [illegible] 2520 

[illegible] 2574 




CLAIMS 

1. A polypeptide comprising first and second domains, wherein said first 
domain is adapted to cleave one or more vesicle or plasma-membrane associated 
proteins essential to exocytosis, and wherein said second domain is adapted (i) to 
translocate the polypeptide into a cell or (ii) to increase the solubility of the 
polypeptide compared to the solubility of the first domain on its own or (iii) both 
to translocate the polypeptide into a cell and to increase the solubility of the 
polypeptide compared to the solubility of the first domain on its own, said 
polypeptide being free of clostridial neurotoxin and free of clostridial neurotoxin 
precursor that can be converted into toxin by proteolytic action. 

2. A polypeptide according to Claim 1 wherein said first domain comprises a 
clostridial toxin light chain. 

3. A polypeptide according to Claim 1 wherein said first domain comprises a 
fragment or variant of a clostridial toxin light chain. 

4. A polypeptide according to Claim 2 or 3 wherein the clostridial toxin is a 
botulinum toxin. 

5. A polypeptide according to any preceding claim wherein the first domain 
exhibits endopeptidase activity specific for a substrate selected from one or more 
of SNAP-25, synaptobrevin/VAMP and syntaxin. 

6. A polypeptide according to any preceding claim wherein said second domain 
comprises a clostridial toxin heavy chain H_N portion. 

7. A polypeptide according to any of Claims 1-5 wherein said second domain 
comprises a fragment or variant of a clostridial toxin heavy chain H_N portion. 

8. A polypeptide according to Claim 6 or 7 wherein the clostridial toxin is a 
botulinum toxin. 

9. A polypeptide according to any of Claims 1-8 further comprising a third 
domain adapted for binding of the polypeptide to a cell, by [illegible] 

11. A polypeptide according to Claim 10 wherein said third domain is a tandem 
repeat synthetic IgG binding domain derived from domain [illegible] of 
Staphylococcal [illegible] 

12. A polypeptide according to Claim 9 wherein said third domain comprises an 
amino acid sequence that binds to a cell surface receptor. 

13. A polypeptide according to Claim 12 wherein said third domain is insulin-like 
growth factor-1 (IGF-1). 

14. A polypeptide according to [illegible] comprising a botulinum toxin 
light chain or a fragment or a variant of a botulinum toxin light chain and a portion 
designated H_N of a botulinum toxin heavy chain. 

15. A polypeptide according to Claim 14 wherein one or both of (a) the toxin 
light chain or fragment or variant of toxin light chain and (b) the portion of the toxin 
heavy chain are of botulinum toxin type A. 

16. A polypeptide according to Claim 15 wherein the botulinum toxin type A 
light chain variant has at residue 2 a glutamate, at residue 26 a lysine and at 
residue 27 a tyrosine. 

17. A polypeptide according to Claim 14 wherein one or both of (a) the toxin 
light chain or fragment or variant of toxin light chain and (b) the portion of the toxin 
heavy chain are of botulinum toxin type B. 

18. A polypeptide according to any of Claims 1-13 comprising a botulinum toxin 
light chain or a fragment or a variant of a botulinum toxin light chain and at least 
100 N-terminal amino acids of a botulinum toxin heavy chain. 

19. A polypeptide according to Claim 18 comprising a botulinum toxin type B 
light chain, or a fragment or variant thereof, and 107 N-terminal amino acids of a 
botulinum toxin type B heavy chain. 

20. A polypeptide according to Claim 15 or 16 comprising at least 423 of the 
N-terminal amino acids of botulinum toxin type A heavy chain. 

21. A polypeptide according to Claim 20 comprising a botulinum toxin type A 
light chain and 423 N-terminal amino acids of a botulinum toxin type A heavy 
chain. 

22. A polypeptide according to Claim 20 comprising a botulinum toxin type A 
light chain variant wherein residue 2 is a glutamate, residue 26 is a lysine and 
residue 27 is a tyrosine, and 423 N-terminal amino acids of a botulinum toxin type 
A heavy chain. 

23. A polypeptide according to Claim 17 comprising at least 417 of the N- 
terminal amino acids of botulinum toxin type B heavy chain. 

24. A polypeptide according to Claim 23 comprising a botulinum toxin type B 
light chain and 417 N-terminal amino acids of a botulinum toxin type B heavy 
chain. 

25. A polypeptide according to any of Claims 14-24 lacking a portion designated 
H_C of a botulinum toxin heavy chain. 

26. A polypeptide comprising a botulinum toxin light chain [illegible] 
surface receptors. 

27. A polypeptide according to Claim 26 lacking an intact portion designated H_C 
of a botulinum toxin heavy chain. 

28. A polypeptide according to any preceding claim comprising a variant of a 
clostridial toxin and further comprising a site for cleavage by a proteolytic enzyme, 
which cleavage site is not present in the native toxin. 

29. A polypeptide according to Claim 28 comprising a variant of a clostridial 
toxin light chain and further comprising a site for cleavage by a proteolytic enzyme, 
which cleavage site is not present in the native toxin light chain. 

30. A polypeptide according to Claim 28 or 29 comprising a variant of a 
clostridial toxin heavy chain H_N portion and further comprising a site for cleavage 
by a proteolytic enzyme, which cleavage site is not present in the native toxin 
heavy chain H_N portion. 

31. A polypeptide according to Claim 28, 29 or 30 obtainable by modification 
of a DNA encoding the polypeptide so as to introduce one or more nucleotides 
coding for the cleavage site. 

32. A fusion protein comprising a fusion of (a) a polypeptide according to any 
of Claims 1-31 with (b) a second polypeptide being a polypeptide or oligopeptide 
adapted for binding to an affinity matrix so as to enable purification of the fusion 
protein using said matrix. 

33. A fusion protein according to Claim 32 wherein said second polypeptide is 
adapted to bind to a chromatography column, such as an affinity matrix of 
glutathione Sepharose. 

34. A fusion protein according to Claim 32 or 33 wherein a specific protease 
cleavage site is incorporated between the first and second polypeptides, said 
protease site enabling proteolytic separation of first and second polypeptides. 

35. A composition comprising a derivative of a clostridial toxin, said derivative 
retaining at least 10% of the endopeptidase activity of the botulinum toxin, said 
derivative further being non-toxic in vivo due to its inability to bind to cell surface 
receptors, and wherein the composition is free of any component, such as toxin or 
a further toxin derivative, that is toxic in vivo. 

36. A composition according to Claim 35 or a polypeptide according to any of 
Claims 1-31 or a fusion protein according to Claim 32, 33 or 34 for use as a 
positive control in a toxin assay. 

37. A composition according to Claim 35 or a polypeptide according to any of 
Claims 1-31 or a fusion protein according to Claim 32, 33 or 34 for use as a 
vaccine against clostridial toxin. 

38. A composition according to Claim 35 or a polypeptide according to any of 
Claims 1-31 or a fusion protein according to Claim 32, 33 or 34 for in vivo use. 

39. A pharmaceutical composition comprising a composition according to Claim 
35, a polypeptide according to any of claims 1-31 or a fusion protein according to 
Claim 32, 33 or 34, in combination with a pharmaceutically acceptable carrier. 

40. A nucleic acid encoding a polypeptide or a fusion protein according to any 
of Claims 1-34. 



41. A nucleic acid encoding a polypeptide or a fusion protein according to Claim 
[illegible] comprising nucleotides encoding residues 1-448 [illegible] 

42. A nucleic acid according to Claim 40 or 41 comprising nucleotides encoding 
residues 1-423 of a botulinum toxin type A heavy chain H_N domain. 

43. A nucleic acid encoding a polypeptide or a fusion protein according to Claim 
40 and comprising nucleotides encoding residues 1-470 of a botulinum toxin type 
B light chain. 

44. A nucleic acid encoding a polypeptide or a fusion protein according to Claim 
40 or 43 comprising nucleotides encoding residues 1-417 of a botulinum toxin type 
B heavy chain H_N domain. 

45. A nucleic acid according to any of Claims 40-44 comprising nucleotides 
encoding a restriction endonuclease cleavage site not present in native clostridial 
toxin sequence. 

46. A nucleotide according to Claim 45 obtainable by modification of a 
nucleotide encoding a polypeptide or fusion protein according to any of claims 1-34 
so as to introduce said cleavage site. 

47. A DNA according to any of claims 40-46. 

48. A DNA selected from SEQ ID NOs: 1, 8, 10, 12, 14, 16, 18, 23 and 24. 

49. A method of manufacture of a polypeptide according to any of Claims 1-31 
comprising expressing in a host cell a nucleic acid according to any of Claims 
40-48 and recovering the polypeptide. 

50. A method of manufacture of a polypeptide according to any of Claims 1-31 
comprising expressing in a host cell a nucleic acid encoding a fusion protein 
according to Claim 32, 33 or 34, purifying the fusion protein by eluting the fusion 
protein through an affinity matrix adapted to retain the fusion protein and eluting 
through said matrix a ligand adapted to displace the fusion protein, and recovering 
the fusion protein. 

51. A method of manufacture according to Claim 49 or 50 in which the nucleic 
acid is DNA. 



52. A cell expressing a polypeptide or fusion protein according to any of Claims 
1-34. 